Psychological Monographs
General and Applied

Combining the Applied Psychology Monographs and the Archives of Psychology
with the Psychological Monographs

Vol. 75
1961

Norman L. Munn, Editor
Bowdoin College
Brunswick, Maine


Consulting Editors

Anne Anastasi          Francis W. Irwin
Frank A. Beach         James J. Jenkins
Arnold M. Binder       Boyd McCandless
W. J. Brogden          Donald W. MacKinnon
Robert R. Bush         Quinn McNemar
John F. Dashiell       Lorrin A. Riggs
James J. Gibson        Carl R. Rogers
D. O. Hebb             David Shakow
Edna Heidbreder        Richard L. Solomon
John E. Horrocks       Ross Stagner

Arthur C. Hoffman, Production Manager
Helen Orr, Promotion Manager

Published by
THE AMERICAN PSYCHOLOGICAL ASSOCIATION, INC.
1333 SIXTEENTH STREET, N.W., WASHINGTON 6, D.C.


CONTENTS OF VOLUME 75


Whole No.

505 Connotative Meaning as a Determinant of Stimulus Generalization. Charles F. Dicken.

506 Some Antecedents and Consequents of Masculine Sex-Typing in Adolescent Boys. Paul
Mussen.

507 An Assessment of the Diagnostic Process in a Child Guidance Setting. Philip A. Marks.

508 A Survey of Psychologists in Community Mental Health: Activities and Opinions on
Education Needs. Ascanio M. Rossi, Donald C. Klein, John M. von Felsinger, and Thomas F. A.
Plaut.

509 Relations between Home Experience and Children's Use of Language in Play Interactions
with Peers. Helen R. Marshall.

510 Psychological Study of the Patient with Pulmonary Tuberculosis: A Cooperative
Research Approach. Claire M. Vernier, Robert P. Barrell, Jonathan W. Cummings, Joseph H.
Dickerson, and H. Elston Hooper.

511 The Process of Group Psychotherapy: Relationships between Hypothesized Therapeutic
Conditions and Intrapersonal Exploration. Charles B. Truax.

512 The Distribution of Susceptibility to Hypnosis in a Student Population: A Study Using
the Stanford Hypnotic Susceptibility Scale. Ernest R. Hilgard, André M. Weitzenhoffer,
Judah Landes, and Rosemarie K. Moore.

513 The Response to Threat: Relations among Verbal and Physiological Indices. George
Mandler, Jean M. Mandler, Irwin Kremen, and Robert D. Sholiton.

514 Residua of Shock-Trauma in the White Rat: A Three-Factor Theory. Kenneth H. Brookshire,
Richard A. Littman, and Charles N. Stewart.

515 An Experimental Analysis of Associative Factors in Mediated Generalization. David L.
Horton and Paul M. Kjeldergaard.

516 Leader Indulgence and Group Performance. F. Loyal Greer.

517 Learning and Memorization of Classifications. Roger N. Shepard, Carl I. Hovland, and
Herbert M. Jenkins.

518 Abilities and Learning Sets in Knowledge Acquisition. Robert M. Gagné and Noel E.
Paradise.

519 A Comparative and Analytical Study of Visual Depth Perception. Richard D. Walk and
Eleanor J. Gibson.


Vol. 75, No. 1 


This study tests some implications of
Charles Osgood's mediation theory of
meaning (Osgood, 1952; Osgood, Suci, & 
Tannenbaum, 1957) and assesses the Se- 
mantic Differential (Osgood & Suci, 1955) 
as an instrument for the measurement of 
the meaning of verbal stimuli. The prin- 
cipal experiments concern mediated gener- 
alization among words selected for meaning 
relationships by the Differential. 


Osgood (1952) defines "meaning" as a
representational mediation process, a theoretical
construct with the functional properties
of an implicit, cue-producing response:


1. Stimulus objects elicit a complex pattern of 
reactions from the organism, these reactions vary- 
ing in their dependence upon the presence of the 
stimulus-object for their occurrence. . . .

2. When stimuli other than the stimulus-object 
(e.g. a sign) but previously associated with it are 
later presented without its support, they tend to 
elicit some reduced portion of the total behavior 
elicited by the stimulus object. . . .

3. The fraction of the total object-elicited behavior
which finally constitutes the stable mediation
process elicited by a sign will tend toward a
minimum set by the discrimination capacity of 
the organism. This is because the sole function 
of such mediating reactions in behavior is to pro- 
vide a distinctive pattern of self-stimulation (cf. 
Hull's conception of the “pure stimulus act"). 

4. The self-stimulation produced by sign-elicited
mediation processes becomes conditioned in varying
strengths to the initial responses in hierarchies
of instrumental skill sequences. This mediated
self-stimulation is assumed to provide the "way
of perceiving" signs or their "meaning" as well
as mediating instrumental skill sequences--behaviors
to signs which take account of the objects represented
(pp. 203-204).


1 Drawn from a doctoral dissertation submitted
to the University of Minnesota. The author is
grateful to James Jenkins and Wallace Russell for
advice and criticism. A report based on some of
the data was read at the meeting of the Western
Psychological Association in Monterey, California,
1958.

2 Now at the University of Chicago.

Figure 1a illustrates the conditioning of
a sign ([S]) to the mediation process
(r_m s_m) when [S] occurs contiguously with
R_T, the total response of the organism to a
stimulus-object S. Figure 1b shows the
conditioning of the sign and the mediator
to an instrumental response R_X.


[Diagram: (a) conditioning of the sign [S] to r_m s_m while S evokes R_T;
(b) conditioning of [S] and r_m s_m to R_X.]

Fig. 1. Meaning as representational mediation
process.


Two or more signs acquire similar mediation
processes if they are associated with Ss
which elicit similar R_T's. Such signs are
considered to have similar meanings. Mediated
ated stimulus generalization would be ex- 
pected to occur among meaningfully similar 
signs because of the similarity of the cue 
properties of the mediators, even though the 
signs are not related in any primary stimu- 
lus dimension. 


Figure 2 illustrates generalization be- 
tween two signs with similar mediation 
processes. A mediation paradigm such as 
this can be used to explain the generaliza- 
tion observed among meaningfully similar 
words in conditioning experiments (Diven, 
1937; Razran, 1935-36, 1949; Riess, 1940) 
and in studies of transfer and retroaction 
(Foley & Cofer, 1943; Haagen, 1943; Mc- 
Geoch & McDonald, 1931; Morgan & 
Underwood, 1950; Osgood, 1946; Young 
& Underwood, 1954; Yum, 1931). Previous
experimentation has been limited by lack
of a unifying conceptualization and by absence
of satisfactory quantitative treatment
of the meaning variable. The authors cited
assessed meaning relationships by formal
dictionary definitions, similarity of logical
or content category, identity of meaning in
different languages, or by judgments of
similarity of meaning or feeling tone. The
most refined technique (Haagen, 1943) employed
scaled judgments of meaning similarity
in sets of adjectives. No study used
comparisons of independent measurements
of the meaning properties of individual
words in determining meaning relationships.


[Diagram: two panels, "Original Learning" and "Generalization".]

Fig. 2. Mediated stimulus generalization between
signs of similar meaning.

The Semantic Differential has been de- 
scribed in detail elsewhere (Osgood, 1952; 
Osgood & Suci, 1955; Osgood, Suci, & 
Tannenbaum, 1957). The underlying as- 
sumptions (Osgood, 1952) are stated as 
follows: 

1. The process of description or judgment can be 
conceived as the allocation of a concept to an 
experiential continuum, definable by a pair of 
polar terms. ... 

2. Many different experiential continua, or ways
in which meanings vary, are essentially equivalent
and hence may be represented by a single dimension. . . .


3. A limited number of such continua can be 
used to define a semantic space within which the 
meaning of any concept can be specified (p. 227). 

Judges typically rate words or other 
stimulus items on each of several seven- 
step continua whose end points are defined 
by “polar” adjective pairs. Osgood and 
Suci (1955) found that most of the covari- 
ance of 50 polar scales used in obtaining 
judgments of the meaning of 20 verbal
concepts could be accounted for by a small 
number of factors. Three major factors 
which appeared in each of two analyses 
were labeled evaluation, potency, and ac- 
tivity. Osgood and Suci interpreted their 
findings as confirming Assumptions 2 and 
3 above. They considered the meaning 
dimensions identified to be primarily con- 
notative in character, and admitted that 
such dimensions may not assess other 
semantic attributes (e.g., denotation) which 
are probably much more numerous. 

Appendix A shows a 20-scale form of the 
Differential constructed by Jenkins, Russell, 
and Suci (1958). Eight scales represent 
primarily an evaluative dimension, and each
of the other major factors, potency and
activity, is represented by three scales.
The remaining scales relate partly to the 
major factors and partly to five minor fac- 
tors. Factor loadings and communalities 
of the scales (Jenkins et al., 1958) are
shown in Appendix B. 

The set of judgmental responses to a
word obtained by use of a multiscale form
of the Differential determines a connotative
meaning profile for the word. Let this set
of responses be termed R_SD. When R_SD
is a mean profile obtained from a group
of judges, it represents the stable, con- 
sensual components of the word's connota- 
tive meaning within the linguistic-cultural 
group from which the judges were drawn. 
Similarity of meaning of words can be 
estimated by comparison of these profiles. 
Mean profiles for two connotatively similar 
words are plotted on the scales in Ap- 
pendix A. 

Osgood and Suci (1952) defined an index
of profile similarity,

$$D = \sqrt{\sum_{i=1}^{n} d_i^{2}}$$

where d_i is the difference between two
scaled stimulus items on Scale i, and n is
the number of scales. The index reflects
profile differences in absolute elevation or
position as well as configural differences.
In the present context, D may be used to
specify the connotative dissimilarity of a
pair of words, i.e., the words' semantic
"distance" in a 20-dimensional space defined
by the scales used to determine the mean 
profiles. Small values of D reflect similarity 
of mean profile and hence similarity of 
measured connotative meaning. 
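
The computation of D can be sketched in a few lines of Python. This is only an illustration of the formula as stated above; the function name `profile_distance` and the two four-scale profiles are hypothetical and are not data from the study, which used 20 scales.

```python
import math


def profile_distance(profile_a, profile_b):
    """Osgood-Suci D: square root of the summed squared scale differences."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must be rated on the same scales")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


# Hypothetical mean ratings on four seven-step scales (illustrative only).
word_x = [6.0, 5.5, 2.0, 4.0]
word_y = [5.0, 5.5, 4.0, 2.0]

d = profile_distance(word_x, word_y)  # differences 1, 0, -2, 2
```

Because the differences are squared before summing, D responds to differences in overall elevation of two profiles as well as to differences in their shape.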

Three related hypotheses are examined
in the present study: (a) The meaning of
a word stimulus may be identified with a
representational process, (r_m s_m), which
functions to mediate generalization among
meaningfully related stimuli. (b) R_SD is an
operational index of r_m s_m; words evoking
similar R_SD's evoke similar r_m s_m's. (c) The
Osgood-Suci D applied to Semantic Differential
profiles is a measure of similarity of
R_SD and hence of r_m s_m.

If these hypotheses are correct, words 
whose Semantic Differential profiles are 
similar (low values of D) would be ex- 
pected to demonstrate mediated stimulus 
generalization in the manner illustrated in 
Figure 2. The mediator r_m s_m, the meaning
measurement R_SD, and the similarity index
D are central constructs in the Osgood 
meaning theory. Evidence of generalization 
among words selected solely for scaled simi- 
larity values of Semantic Differential pro- 
files would substantiate the usefulness of 
these constructs. 

Initial evidence on the relationship of 
R_SD to verbal generalization was reported
by Ryan (1957). He compared the effect 
of previously established word association 
habits with that of similarity of Differential 
profile (measured by D) in facilitation of 
transfer in paired-associate learning. Inter- 
polated list stimulus words which were re- 
lated by association habits to original list 
stimuli demonstrated the greatest transfer. 
Transfer was, however, also observed for 
interpolated list stimulus words related to 
original list stimuli only by similarity of 
Differential profile. Ryan’s data suggested 
the profitability of further investigation of 
R_SD and verbal generalization, and indicated
the need for attention to the associative 
factor. 

A technique which lends itself readily to 
problems of generalization among verbal 
stimuli had been worked out by Mink 
(1957). In a study of associative determinants
of verbal generalization, Mink
trained subjects to press a lever in response
to words which were the stimulus terms 
(X) of word pairs which had been found 
associatively related in a free-association 
study (Russell & Jenkins, 1954). He found 
that the lever response generalized to the 
response terms (Y) of the (X → Y) associative
pairs. Mink's experimental procedures
were adapted for the present
experiments. Details of his findings will 
be referred to later in relation to the role of 
the associative factor in the present study. 


METHOD 


Six generalization experiments were per- 
formed with the same design and pro- 
cedures. In an original learning phase, 
subjects performed a simple lever-press re- 
sponse upon visual presentation of each 
of a set of word stimuli. Trials were rapid 
and few to preclude complete learning. 
The original words, words similar in mean- 
ing to these, and control words were then 
presented as a generalization test. Subjects 
were instructed to respond to words they 
believed they recognized from the original 
set. Response to words similar in meaning 
to the original words in excess of response 
to control words was the measure of mean- 
ing-determined stimulus generalization. De- 
tails of procedure common to all experiments 
follow; variations introduced in individual 
experiments and procedures used in supple- 
mentary studies of the stimuli are discussed 
later. 


Selection and Arrangement of Stimuli 


The stimuli were drawn from a population of 
360 verbal concepts³ for which Jenkins, Russell,
and Suci (1958) had obtained mean Semantic 
Differential profiles. Each mean profile was deter- 
mined by the judgments of 15 male and 15 female 
college students recorded on the scales shown in 
Appendix A. Test-retest reliability of mean scale
values for 20 randomly selected concepts was .97. 

Jenkins, Russell, and Suci (1959) compared all 
possible pairs of concepts in the 360-word popula- 
tion (64,620 pairs) with the Osgood-Suci D index. 


3 The majority of the concepts were single
words. The small number of two-word concepts 
included in the population was disregarded in 
selecting stimuli for the present experiments. 
TABLE 1
WORDS AND INTRACLUSTER D VALUES
(Clusters A, B, M, and N)

Rank  Word            1     2     3     4     5     6     7     8

Cluster A
1     WOMAN
2     FRAGRANT     2371
3     PRETTY       2237  2420
4     GIRL         1868  2424  1940
5     LOVELY       2530  2013  2248  2798
6     CHARMING     2092  2363  2241  2790  2229
7     BODICE       2114  2323  2428  2028  2848  2951
8     SONG         2644  2127  2683  2770  2673  2480  2458
      Mean D       2265  2292  2313  2374  2444  2449  2450  2543
      Cluster mean 2391

Cluster B
1     TROUBLE
2     HURT         1612
3     SCORCHING    2233  2473
4     PAIN         1998  1829  2893
5     MAD          2327  2561  2631  2904
6     DANGER       2573  2466  2634  2448  2704
7     SCALDING     2891  2764  2045  3087  2658  2525
8     FRIGHTFUL    2732  2830  2597  2772  2200  3209  2651
      Mean D       2338  2362  2501  2562  2569  2651  2660  2713
      Cluster mean 2543

Cluster M
1     PROGRESS
2     EFFORT       1572
3     STUDY        2223  1681
4     INCOME       2162  1957  2533
5     MONEY        2563  2176  3082  2264
6     UP           3009  2168  2401  3061  2856
7     PATRIOT      2490  2282  2944  3007  3047  2842
8     LIFT         3037  2405  3067  2931  2363  2242  2619
      Mean D       2249  2381  2562  2563  2617  2654  2747  2752
      Cluster mean 2566

Cluster N
1     BAD
2     THIEF        2267
3     CROOKED      2143  2556
4     FRAUD        2089  2334  2110
5     STEAL        1904  1680  2719  2496
6     CRIMINAL     2805  1630  2787  2855  2641
7     NASTY        2517  3052  2423  2701  3181  2724
8     HATE         2325  2915  3120  3069  3311  2496  2531
      Mean D       2331  2382  2471  2501  2591  2647  2687  2824
      Cluster mean 2554


Note. D values are shown without the decimal point which would follow the first digit.
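
As a reading aid for Table 1, the following Python sketch shows one way the average intracluster D for each word could be recomputed from a lower-triangular listing. It assumes the average is taken over a word's seven D values with the other members of its cluster. The entries are the Cluster A values as they appear in the table above; individual digits in a scanned table may be unreliable, so recomputed averages need not match every printed value. The function and variable names are illustrative.

```python
WORDS = ["WOMAN", "FRAGRANT", "PRETTY", "GIRL",
         "LOVELY", "CHARMING", "BODICE", "SONG"]

# Lower triangle of Cluster A, printed without the decimal point.
LOWER = [
    [],
    [2371],
    [2237, 2420],
    [1868, 2424, 1940],
    [2530, 2013, 2248, 2798],
    [2092, 2363, 2241, 2790, 2229],
    [2114, 2323, 2428, 2028, 2848, 2951],
    [2644, 2127, 2683, 2770, 2673, 2480, 2458],
]


def mean_intracluster_d(lower):
    """Mirror the lower triangle, then average each word's D with the others."""
    n = len(lower)
    full = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(lower):
        for j, value in enumerate(row):
            # Restore the decimal point after the first digit.
            full[i][j] = full[j][i] = value / 1000.0
    return [sum(row) / (n - 1) for row in full]


means = dict(zip(WORDS, mean_intracluster_d(LOWER)))
```

For example, the seven values involving WOMAN average to about 2.265, which agrees with the first entry of the Mean D row.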


Values of the index for pairs drawn from the concept
population vary from less than 1.3 for highly
similar items to more than 12.0 for highly dissimilar
words. Norman (1959) found that D
values for comparisons of mean profiles for the
same concept obtained from two samples of judges
varied from 1.28 to 2.92. He interpreted the data
as suggesting that D values less than 2.00 should
not be regarded as evidence that two concepts
differ in measured connotative meaning. Norman
also found that between-concept D values correlated
.92 in two samples of judges, indicating a
high degree of stability for relative values of D
obtained in comparisons of mean profiles.

Four clusters of eight words with uniformly 
high intracluster connotative similarity were derived
from the 360-word population by an arbitrary
iterative procedure.⁴ Table 1 shows the
clusters, the intracluster D values for each pair, 
the average intracluster D for each word and the 
rank of this average, and the mean intracluster D. 

Table 2 shows the elements used in the learning 
and test phases of a typical experiment. Two 
eight-word clusters, termed "experimental words" 
and designated, for example, A and B, were used
in each experiment. The prime and double prime
(e.g., A′, A″) indicate subclusters formed by
assigning words ranking 1, 3, 5, and 7 in each 
eight-word cluster to the learning list and the 
remainder to the test list. The two test list sub- 
clusters, eight words in all, served as generalization 
stimuli and are termed G words. Meaning- 
determined stimulus generalization was expected 
to occur between each learning list subcluster 
and its test list counterpart in a manner analogous 
to that shown in Figure 2, except that the conditioning
of an instrumental response to the
mediator would presumably be quadruply reinforced
by the four related L list words, and
manifested upon presentation of each of the four
corresponding G words.
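
The assignment rule just described (ranks 1, 3, 5, and 7 to the learning list, the remainder to the test list) can be sketched as follows. The function name `split_cluster` is illustrative; the word order is the rank order of Cluster A in Table 1.

```python
def split_cluster(ranked_words):
    """Split a rank-ordered cluster into learning and test subclusters."""
    learning = ranked_words[0::2]  # ranks 1, 3, 5, 7
    test = ranked_words[1::2]      # ranks 2, 4, 6, 8
    return learning, test


cluster_a = ["WOMAN", "FRAGRANT", "PRETTY", "GIRL",
             "LOVELY", "CHARMING", "BODICE", "SONG"]

a_prime, a_double_prime = split_cluster(cluster_a)
```

Alternating by rank gives each subcluster a similar spread of average intracluster D values, so neither list receives only the most central words of the cluster.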

The eight control (C) words provide a baseline
for determining response to G words in excess
of that expected on the basis of chance or factors
other than the hypothesized generalization. The
C words were selected for two properties: (a)
absence of connotative similarity in relation to
all 16 experimental words. Absence of similarity
was defined by D > 6.00⁵ for each C word in
relation to every experimental word. (b) Comparability
with the eight G words in Thorndike-Lorge
(1944) usage frequency.


TABLE 2
COMPONENTS OF TYPICAL EXPERIMENT

Learning (L) list            Test (T) list

Cluster A′ words    4        Cluster A″ words (G words)    4
Cluster B′ words    4        Cluster B″ words (G words)    4
Filler (F) words    4        Control (C) words             8
                             L list words                 12
Total              12        Total                        28


4 The procedure consisted of successive application
of a set of rules devised to locate words
which, when added to an initial cluster of size two,
would yield clusters of the desired size with a
minimum intracluster D. Consideration was given
to obtaining two clusters each from opposite ends
of the major factorial dimension, evaluation. The
experimenter applied the rules to tabular data
furnished by Jenkins, Russell, and Suci (1959)
in which words were identified by number only,
eliminating the possibility of a subjective factor
in selection of "similar" words and insuring equal
likelihood of inclusion of any word with the desired
quantitative properties.

Filler (F) words were used to expand the
learning list to reasonable length and to reduce 
the chance of conscious recognition of the mean- 
ingfully related clusters. The F words were 
selected for D ≥ 6.00 in relation to all C words,
to reduce the possibility of augmentation of 
response to the latter by generalization from the 
F words. 
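
A distance criterion of this kind amounts to a simple filter over candidate words. The sketch below illustrates the control-word rule (D greater than 6.00 to every experimental word) under stated assumptions: the three-scale profiles and the labels E1, E2, C1, C2, and C3 are invented for illustration, and `select_controls` is a hypothetical helper, not a procedure named in the study.

```python
import math


def profile_distance(profile_a, profile_b):
    """Osgood-Suci D between two profiles rated on the same scales."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


def select_controls(candidates, experimental, threshold=6.0):
    """Keep candidates whose D exceeds the threshold for every experimental word."""
    return [
        word
        for word, profile in candidates.items()
        if all(profile_distance(profile, other) > threshold
               for other in experimental.values())
    ]


# Hypothetical three-scale mean profiles (illustrative only).
experimental = {"E1": [6.0, 6.0, 6.0], "E2": [6.0, 5.0, 6.0]}
candidates = {
    "C1": [1.0, 1.0, 1.0],  # far from both experimental words
    "C2": [5.0, 5.0, 5.0],  # too close to both
    "C3": [2.0, 6.0, 2.0],  # too close to E1 (D is about 5.66)
}

controls = select_controls(candidates, experimental)
```

The same filter, applied with the C words as the reference set, would express the criterion stated for the F words.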

Each L list was arranged in three sequences
so that each item occurred once in each third
and no more than two words of the same class
(A′, B′, or F, for example) occurred in sequence.
Six different random orders of the T list were 
prepared. 


Subjects 


Twenty male undergraduate psychology students 
were subjects in each experiment. The subjects 
received two points final examination credit for 
participating. 


Apparatus and Procedure 


Each subject was tested individually. The subject
was told the experiment concerned memory
for words, recognition time, and reaction speed.
One order of the L list was read aloud, one
word per second, and then the other two orders
were presented visually at the same rate on a
standard Hull-type memory drum. During the
visual presentation, the subject grasped a lever
movable in a 60° arc and was instructed to press
the lever through the full arc upon viewing each
word. Before both auditory and visual presentations,
the subject was told to try to remember the
words so as to be able to respond to them later.


5 Since stimulus generalization has been demonstrated
for antonyms (Cofer, Janis, & Rowell,
1943), very large semantic distances of C words
in relation to L list words were avoided insofar as
possible. Any failure to avoid the antonym-based
stimulus generalization to C words is of course a
conservative error with respect to the use of response
to C words as a baseline for determining
generalized response to G words.


The six orders of the T list were next presented 
visually at the rate of one word per second. The 
subject was told to press the lever as quickly as 
possible in response to each learning list word he 
recognized or believed he recognized. The move- 
ments of the lever were relayed to a Sanborn 
recorder set for sensitivity to even slight pres- 
sures. Response amplitude was recorded linearly. 
Preliminary experimentation indicated no consist- 
ent amplitude differences for any of the stimulus 
classes, and the records were subsequently scored 
on the basis of presence or absence of response 
of any magnitude to each T list word. 


Statistical Treatment of Response Data 


The total number of responses of the 20 subjects
to each T list word was computed and
divided by the maximum total of 120 (six repetitions
of each word for 20 subjects) to give a
percentage response value for each word. In
addition, four measures of generalization were
computed for each subject: (a) total generalization,
G_T, determined by subtracting total C word
response from total G word response; (b and c)
subcluster generalization, computed by subtracting
total C word response from twice the total level
of response to a four-word subcluster (e.g.,
G_A = 2A″ − C); and (d) differential subcluster
response, D_G (e.g., total A″ response minus total
B″ response). Means and variances of each of
these measures were computed, and the statistical
significance of the deviation of each mean from
zero was evaluated by a two-tailed t test with
19 df. A test of the significance of response to
individual G words is described later.
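
The four per-subject measures and the test of each mean against zero can be sketched as below. This is a minimal illustration, assuming the inputs are one subject's response totals to the A″ words, the B″ words, and the C words; the function names and the example numbers are hypothetical, and the t statistic is the ordinary one-sample form (mean divided by its standard error), which the text does not spell out.

```python
import math
import statistics


def generalization_scores(a_total, b_total, c_total):
    """Return (G_T, G_A, G_B, D_G) for one subject's response totals."""
    g_total = (a_total + b_total) - c_total  # all G words minus C words
    g_a = 2 * a_total - c_total              # doubling equates 4 words with 8
    g_b = 2 * b_total - c_total
    d_g = a_total - b_total                  # differential subcluster response
    return g_total, g_a, g_b, d_g


def one_sample_t(values):
    """t statistic and degrees of freedom for the mean of values against zero."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return statistics.mean(values) / standard_error, n - 1


# Hypothetical totals for a single subject (illustrative only).
scores = generalization_scores(a_total=7, b_total=3, c_total=4)

# Hypothetical G_T values for five subjects; the study used 20 (19 df).
t_value, df = one_sample_t([1, 2, 3, 4, 5])
```

Doubling the subcluster total puts four G words on the same footing as the eight C words, so a value of zero corresponds to no generalization beyond the control baseline.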


TWO GENERALIZATION EXPERIMENTS


Experiment 1 


Clusters A and B were used in the first
experiment in the manner illustrated in
Table 2. Table 3 shows the words used in
the experiment and their assignment to the
L and T lists. Percentage of possible response
for each word on the test trial is
also shown. The percentage response by
stimulus class for the crucial T list words
was 29.6 for A″ words, 14.0 for B″ words,
and 7.5 for C words. The average total
generalization value (G_T) was 6.58
(p < .001). The average generalization
value for A″ words (G_A) was 9.95 (p <
.001). Average generalization for B″ words
(G_B) was 3.10 (p < .01), and the average
differential generalization value for the
subclusters of G words (D_G) was 3.4
(p < .01).

The hypothesis of stimulus generalization
to T list words connotatively similar to
original words was confirmed by the data
of the first experiment. Two equally evident
findings had not been expected: the
two subclusters of G words differ significantly
in response frequency, and individual
G words vary widely in response frequency.
Nothing in the generalization hypothesis or
in the procedures for selecting the stimuli
indicated that one cluster should yield more
generalization than another, or that more
than random variation should occur in responses
to individual G words.


TABLE 3
PERCENTAGE RESPONSE TO INDIVIDUAL WORDS:
TEST TRIAL, EXPERIMENT 1

L list words    % response    Words unique to T list    % response

A′                            A″
BODICE          99            CHARMING                  51
LOVELY          80            GIRL                      32
PRETTY          78            FRAGRANT                  25
WOMAN           78            SONG                      04
                              All A″ words              …

B′                            B″
SCALDING        96            FRIGHTFUL                 28
SCORCHING       96            PAIN                      12
MAD             76            DANGER                    09
TROUBLE         50            HURT                      …
                              All B″ words              14.0

F                             Control
SUCCESS         96            FAT                       1…
CAR             72            PLAIN                     18
FOOD            54            SLOW                      10
AFRAID          47            SNAIL                     07
                              SQUARE                    0…
                              TRUNK                     0…
                              DIM                       …
                              LONG                      …
                              All C words               …
Examination of the available data for the
individual stimuli failed to yield an explanation
of these phenomena. Cluster B is
slightly less connotatively similar internally
than Cluster A (Table 1), but the difference
is very small, and does not seem to plausibly
account for the large difference in generalization
to A″ and B″. The F words are
slightly more closely related connotatively
to Cluster A than to Cluster B (the D means
for the two 4 × 8 word comparisons are
7.15 and 8.10, respectively), but the difference
is again small, and the absolute values
of D are so large as to call into question any
reference to "similarity" of F words and experimental
words. The average Thorndike-Lorge
frequency of the A″ words is
slightly less than the average for the B″
words. The small intracluster differences in
average D values for individual words and
the Thorndike-Lorge frequencies do not
relate systematically to the differences in
individual word response frequency.


Experiment 2 


Experiment 2 was performed to replicate
Experiment 1 for subjects and verbal materials,
in order to determine the stability
of the main generalization effect and of the
observed variability in response to subclusters
and individual words. Clusters M
and N were assigned as experimental words.
A new set of control words and four new
filler words balanced closely for similarity
to the M and N clusters were selected. The
components and response percentages for
Experiment 2 are shown in Table 4.

Subcluster N″ received 35.4% of the possible
response, Subcluster M″ 19.2%, and
the C words 9.4%. G_T = 8.60 (p < .001);
G_M = 4.70 (p < .01); G_N = 12.45 (p <
.001); D_G = 3.90 (p < .001).

The main generalization hypothesis was
again confirmed. The "cluster predominance"
effect seen in the first experiment
appeared again, and although the data were
somewhat more encouraging as to uniformity
of individual G word response, two
members of M″ fall far below the other G
words in response frequency.


TABLE 4
PERCENTAGE RESPONSE TO INDIVIDUAL WORDS:
TEST TRIAL, EXPERIMENT 2

L list words    % response    Words unique to T list    % response

M′                            M″
PATRIOT         98            EFFORT                    38
PROGRESS        82            INCOME                    21
MONEY           67            UP                        10
STUDY           57            LIFT                      07
                              All M″ words              19.2

N′                            N″
BAD             90            HATE                      41
STEAL           71            THIEF                     38
NASTY           71            CRIMINAL                  32
CROOKED         60            FRAUD                     30
                              All N″ words              35.4

F                             Control
ABRUPT          85            SLEEP                     17
FEAR            82            STOUT                     12
BLOCK           73            DUSKY                     11
CITY            12            PIG                       08
                              ROUND                     08
                              CALM                      08
                              MOLD                      06
                              SOFT                      05
                              All C words               9.4


Response to Individual G Words 


Inspection of the data from the first two 
experiments showed that G word response 
frequencies tended to fall in two “clumps” 
rather than in one highly variable distribu- 
tion. The G words seemed to "work" or 
"not work" as generalization stimuli. A test 
of this possibility was carried out as fol- 
lows. The distributions of frequency of 
response to the control words in the two 
experiments did not differ significantly in 
means or variances, and these distributions 
were combined to form a single distribution 
of frequencies of response to control words 
with N = 16. The mean and standard deviation
of this pooled distribution were taken
as estimates of the parameters of a popula- 
tion of control word response frequencies. 
None of the six G words in Experiments 1 
and 2 with the lowest response frequencies 
(SONG, HURT, LIFT, DANGER, UP, and PAIN) 
have response frequencies which deviate
more than one standard deviation from the 
mean of this pooled C word distribution. In 
contrast, all of the remaining 10 G words 
fall more than 3.5 standard deviations above 
the mean of the pooled C word distribution, 
a deviation expected less than 4 times in 
10,000 in cases drawn from a population 
with parameters defined by the C word 
distribution. Thus, the first group of G 
words is essentially indistinguishable from 
the C words in response frequency, while 
the second group is clearly distinguishable 
from the C words in this respect. The first 
group of 6 G words may be considered 
“nonfunctional” with respect to generaliza- 
tion; the other 10 G words appear clearly 
“functional” in this respect. 
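
The logic of this comparison can be illustrated with standard scores. The sketch below is not a reproduction of the pooled analysis: it uses only the Experiment 2 control percentages from Table 4 (eight words rather than the pooled sixteen), and it assumes a sample standard deviation and a one-tailed normal probability, neither of which is specified in the text. All function and variable names are illustrative.

```python
import math
import statistics

# Experiment 2 percentages from Table 4 (control words, then G words).
CONTROL = [17, 12, 11, 8, 8, 8, 6, 5]
G_WORDS = {"EFFORT": 38, "INCOME": 21, "UP": 10, "LIFT": 7,
           "HATE": 41, "THIEF": 38, "CRIMINAL": 32, "FRAUD": 30}


def z_scores(g_words, control):
    """Express each G word's percentage in control-word SD units."""
    mean = statistics.mean(control)
    sd = statistics.stdev(control)  # assumption: sample SD
    return {word: (pct - mean) / sd for word, pct in g_words.items()}


def upper_tail(z):
    """One-tailed probability of a standard normal deviate exceeding z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


z = z_scores(G_WORDS, CONTROL)
near_baseline = [word for word, value in z.items() if abs(value) <= 1.0]
p_beyond_3_5 = upper_tail(3.5)
```

With these Experiment 2 figures alone, UP and LIFT fall within one standard deviation of the control mean, in line with their classification as nonfunctional in the text. The one-tailed probability beyond 3.5 standard deviations is roughly .0002, consistent with the bound of 4 in 10,000 quoted above.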

Confirmation of the hypothesis that only 
some of the G words functioned as generali- 
zation stimuli directed attention to the 
possibility that associative habits not re- 
flected in the connotative meaning measure- 
ment might account for some or all of the 
obtained generalization. 


ASSOCIATIVE RELATIONSHIPS OF THE 
GENERALIZATION STIMULI 


Generalization between words related by
previously established associative habits has
been repeatedly demonstrated. Investigators
have assumed pre-existing verbal associations
of the form (X → Y) from the frequent
occurrence of Y as a response to X
in normative studies of free association
(Kent & Rosanoff, 1910; Russell & Jenkins,
1954). Ryan (1957) found that training
with a series of paired verbal items X → A
facilitated learning of a second series Y → A
when X → Y associations were frequent in
the Minnesota (Russell & Jenkins, 1954)
norms. Bastian (1957) found that high
frequency X → Y associations inferred
from the same norms facilitated learning
of A → Y pairs when A → X pairs had
been previously learned. Relearning of the
A → X pairs following the interpolated
task was also facilitated by X → Y associations.
McClelland and Heath (1943)
found interference in the relearning of
X → A pairs when a Y → B task was
interpolated and X → Y associations were
frequent in the O'Connor (1928) norms.
Mediated generalization of the form
X → (Y) → B was assumed to retard
relearning of the correct X → A pairs.

Mink's (1957) extensive study of stimulus
generalization between unidirectional
verbal associates (X elicits Y associatively,
but Y does not elicit X) is particularly
pertinent. Generalization of lever-press responses
from X words used in the learning
list to Y words used in the test list was
demonstrated in four experiments. The
generalization occurred under varying rates
of presentation, differing list arrangements,
and different strengths of the association
variable. A regular relationship between
association strength and amount of generalization
was observed for associates of
medium and high frequencies, although no
generalization occurred between X → Y
pairs with less than 30% normative association
frequency. In accounting for his
findings, Mink assumed an implicit occurrence
of a response like pronouncing the
Y word at the time of the (X → lever
press) learning trials. This was inferred
on the basis of the tendency of Y to occur
associatively to X. If this were so, some
strength would presumably then accrue to
habits of the form (Y → lever press), and
result in response to Y when it appeared
on the generalization trial. Figure 3 shows
these relationships, which are termed the
Mink paradigm. X and Y are unidirectional
associates as described above, (Y) is the
hypothesized implicit occurrence of the Y
response, and R is the lever-press response.


Learning list          Test list

X   ———→ R             Y ———→ R
(Y)


FIG. 3. Mink paradigm for associatively-determined
generalization.


Mink also conducted four experiments in
which the X and Y words were deployed
in what he termed the Shipley-Lumsdaine
mediation paradigm (Lumsdaine, 1939).
Figure 4 illustrates this arrangement and
the expected generalization of the lever re- 
sponse from Y to X. Contrary to expecta- 
tion, and in spite of several variations in 
procedure, there was no generalization to 
the X words. Mink concluded that direction 
as well as strength of association is crucial 
in generalization mediated by associative 
habits. 


Learning list          Test list

Y ———→ R               X ———→ (Y) ———→ R


FIG. 4. Shipley-Lumsdaine paradigm for associatively-determined
generalization.


The clear evidence for stimulus generali- 
zation to the Y term of an X — Y associa- 
tive pair in the Ryan and Mink studies 
suggested that associative relationships 
might account for the variation in G word 
response in the present study. If G word 
response varied with the strength of asso- 
ciative relationships between G words and 
L list words, connotative similarity and 
association might be assumed to have func- 
tioned as dual determinants of the observed 
generalization. If substantial associative 
relationships were found between L list 
words and the "functional" group of G 
words but none were found between L list 
words and the “nonfunctional” group of 
G words, the associative factor might 
plausibly be considered the major deter- 
minant of the observed generalization. 

Since the Minnesota norms contain asso-
ciative data for only a small part of the 
words used in the present experiments, free 
association data for these words were ob- 
tained from a sample of 128 college stu- 
dents. The format and instructions of the 
association study were identical to those 
used in the Minnesota study. The order of 
presentation of stimuli was rotated with 
respect to classification in the generalization 
studies: an A word, a B word, an Experiment 1
F word, an Experiment 1 C word,
an M word, an N word, an Experiment 2
F word, and an Experiment 2 C word. No
G word appeared in the list until all L list
words had appeared. This was done to con- 
trol for the possibility that G words would 


occur as associative responses to L list 
words with a spuriously high frequency 
because of recency of prior occurrence as 
stimuli. (Storms, 1956, found that recent
occurrence of Y as a stimulus results in 
higher frequencies of Y as a free associa- 
tion response.) By necessity, this arrange- 
ment of the list is biased toward spuriously 
high frequencies of L list words as associa- 
tive responses to G words. 


Results 


L word — G word associations (Mink 
paradigm). Mink found associative habits 
between words were relevant to generaliza- 
tion only when the learning list words were 
the stimulus terms (X) and the test list 
words were the response terms (Y) of 
associative pairs (X — Y). Table 5 shows 
the associative data relevant to the possi- 
bility of Mink-paradigm associative gener- 
alization in Experiments 1 and 2. The 
left-hand columns show all L list words 
which associatively elicited the G word 
shown in the right-hand column, and the 
percentage of the 128 subjects who gave 
the G word as an associative response to 
each L list word. The percentages are not 
independent. G word GIRL, for instance, 
was the associative response to L list word 
pretty for 23% of the subjects, the re- 
sponse to woman, for 12% of the subjects, 
etc., with no constraint against the same 
subject responding GIRL to more than one 
of these L list words. 

The data indicate that Mink-paradigm 
generalization was a negligible factor in 
the outcomes of Experiments 1 and 2. Even 
if the nonindependent response frequencies 
are cumulated, only one G word of the 16 
(GIRL) occurs frequently enough as an asso- 
ciative response of L list words to deter- 
mine generalization of the kind Mink 
observed. Most of the G words with high 
response frequencies in the generalization 
experiments have zero or very small fre- 
quencies as associative responses to L list 
words. 
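
The cumulation argument for GIRL can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the four Table 5 percentages for GIRL and compares them with the roughly 30% normative frequency below which Mink (1957) observed no generalization. The names `girl_elicitors`, `MINK_THRESHOLD`, and `cumulated` are hypothetical, and treating the 30% figure as a hard cutoff is an assumption made for the example.

```python
# Percentage of the 128 subjects who gave GIRL as a response to each
# L list word (Table 5, Experiment 1).
girl_elicitors = {"PRETTY": 23.4, "WOMAN": 11.7, "LOVELY": 10.9, "BODICE": 4.7}

# Assumption: Mink's lower bound for generalization, treated as a cutoff.
MINK_THRESHOLD = 30.0


def cumulated(percentages):
    """Sum the (nonindependent) elicitation percentages for one G word."""
    return round(sum(percentages.values()), 1)


total = cumulated(girl_elicitors)
strongest_single = max(girl_elicitors.values())

print(f"cumulated: {total}%  reaches threshold: {total >= MINK_THRESHOLD}")
print(f"strongest single elicitor: {strongest_single}%  "
      f"reaches threshold: {strongest_single >= MINK_THRESHOLD}")
```

No single L list word reaches the cutoff on its own; GIRL does so only when the nonindependent percentages are summed, which is why the text treats it as the one possible exception.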

G word — L list word associations 
(Shipley-Lumsdaine paradigm). Applica- 
tion of the Shipley-Lumsdaine paradigm of 




TABLE 5


PERCENT INCIDENCE OF G WORDS AS ASSOCIATIVE RESPONSES TO L LIST WORDS

(Associations Relevant to Mink Paradigm)


            Experiment 1                                Experiment 2

L list word(s) which elicit    G word       L list word(s) which elicit    G word
G word, and percentage         as           G word, and percentage         as
elicitation of G word          response     elicitation of G word          response

A" words                                    M" words
PRETTY         23.4                         STUDY            .8            EFFORT
WOMAN          11.7                         STEAL            .8            LIFT
LOVELY         10.9            GIRL         (No elicitor)                  INCOME
BODICE          4.7                         (No elicitor)                  UP
(No elicitor)                  CHARMING
(No elicitor)                  FRAGRANT
(No elicitor)                  SONG

B" words                                    N" words
TROUBLE         2.3                         STEAL            [illegible]
AFRAID          [illegible]    DANGER       CROOKED          3.9           THIEF
(No elicitor)                  FRIGHTFUL    BAD              1.6
(No elicitor)                  PAIN         CROOKED           .8           CRIMINAL
(No elicitor)                  HURT         STEAL             .8
                                            MONEY             .8
                                            FEAR             3.4           HATE
                                            (No elicitor)                  FRAUD


association-determined generalization to the
context of the current experiments requires
associative habits of form (G word → L
list word). The application of the paradigm
can be clarified by substituting Y, the
notation adopted for the response term of
an associative pair, for "L list word," and
X, the notation for the stimulus term of
an associative pair, for "G word." If associative
habits determine generalization in
the Shipley-Lumsdaine paradigm (Figure
4), a lever response learned to an L list
word (Y) should generalize to a G word
(X) when (G word → L list word) associations
are of sufficient strength.


Table 6 shows the percentage of the 128
subjects who responded associatively with
L list words when G words were used as
stimuli. No control word elicited an L list
word, and no F words were elicited. The
percentages for different L list words elicited
by the same G word are independent,
since each G word appeared only once and
elicited a single response from each subject.
Associative relationships of some G and
L words in the (G word → L list word)
direction are considerably greater than those 
in the reverse direction. This is particularly 
true if response percentages for the several 
L list words elicited by a single G word are 
combined.⁶ Four of the 16 G words
(CHARMING, GIRL, INCOME, THIEF) elicit
one or another L list word in more than
20% of the subjects. Two more G words
(CRIMINAL, FRAUD) elicit some L list word
in more than 10% of the subjects. All 6
of these G words fall in the "functional"
category in regard to generalization. If
the 6 "nonfunctional" G words are compared
with the 10 "functional" G words,
frequency of (G word → L list word)
associations is significantly greater for the
latter at the .05 confidence level.⁷


6 Such cumulation is statistically legitimate in
view of the independence of the percentages.
Whether a set of relatively low-frequency associations
can combine to yield sufficient associative
"strength" to determine generalization is not
known. Such combination seems logical, however,
since it is a matter of indifference in the present
context which L list word is an associate of a
G word, and since the "strength" of a given
associative pair is by definition maximal in any
individual subject for whom X associatively elicits
Y, irrespective of the infrequency of (X → Y) in
group data.

7 Mann-Whitney U test, two-tailed.


TABLE 6 


PERCENT INCIDENCE OF L LIST WORDS AS ASSOCIATIVE RESPONSES TO G WORDS


(Associations Relevant to Shipley-Lumsdaine Paradigm) 


Experiment 1 Experiment 2 
G word L list word as response to G word G word L list word as response to G word 
as and percentage elicitation of as and percentage elicitation of 
Stimulus L list word Stimulus L list word 
A" words M” words 
WOMAN 12.5 INCOME MONEY 55.4 
CHARMING PRETTY 7.0 
LOVELY 54 EFFORT PROGRESS 8 
GIRL WOMAN 12.5 UP (None) 
PRETTY 10.2 
LIFT (None) 
SONG PRETTY 
FRAGRANT (None) 
B" words 
= 
AFRAID STEAL 33.6 
FRIGHTFUL WOMAN PAD. e 
LOVELY MONEY 1.6 
AFRAID 8 
8 
DANGER | = cama VP DTE 
| ВАР 10.9 
HURT (None) STEAL 1.6 
РАП STEAL 94 
^ psy BAD 3.9 
MONEY 1.6 
STUDY 8 
(None) 
Three factors must be considered in
evaluating the possibility that (G word → L
word) associations (Shipley-Lumsdaine
paradigm) have a relationship to the generalization
frequencies of the G words.


1. The fact that the observed associative 
frequencies may be spuriously elevated by 
the recency factor (Storms, 1956) may 
account for the higher association values in 
Table 6 compared with those in Table 5. 
This could vitiate an interpretation of G
word response in terms of association- 
determined generalization. The differences 
in observed association values of the func- 
tional and nonfunctional G words in Table 
6 cannot, however, be accounted for by the 
recency factor. 


2. The possibility of spurious associative
frequencies due to recency does, however,
contribute to the cogency of a second consideration.
The association frequencies of
(G word → L word) relationships are with
two exceptions below the level Mink (1957)
found necessary for generalization. This is
true even when response percentages of
different L list words are cumulated. If
the observed (G word → L word) associative
frequencies are spuriously high because
of the recency factor, the possibility that 
any of the observed generalization can be 
attributed to associative factors is, of 
course, further reduced. In any instance, it 
is clear that the associative frequencies are 
not sufficient to explain all of the gener- 
alization, since several of the functional G 
words do not associatively elicit L list 
words. 

3. The most important consideration is 
Mink’s carefully replicated finding that 
generalization does not occur on the basis 
of associative relationships employed in the 
Shipley-Lumsdaine paradigm. Unless later 
evidence reverses Mink’s findings, it does 
not appear plausible to attribute the differ- 
ences in G word generalization in the pres- 
ent experiments to associative habits in the 
(G word → L word) direction.

The absence of associative relationships 
appropriate to the Mink paradigm and the 
above considerations concerning associative 
relationships relevant to the  Shipley- 


Lumsdaine paradigm suggest that the association
data do not account for the differences
in G word response in Experiments 1 and
2. The observed relationship between (G
word → L word) associations and generalization
does, however, indicate the value of
further investigation of the Shipley-Lumsdaine
associative mediation paradigm and
the need for control of the recency factor
in gathering associative data for clarification
of the generalization problem.


THE EFFECT OF CHANGE OF
VERBAL CONTEXT


The outcome of Experiments 1 and 2 
suggested two possible relationships between 
the cluster predominance phenomenon and 
the wide variability in response to individ- 
ual G words. 


1. Predominance of one cluster over the
other in the subject's attention during learning
and/or generalization might be intrinsic
to the two-cluster design. If this were so,
variation in response to individual G words
might be in part attributable to "suppression"
of response to the G words in the
subordinated cluster. Five of the six nonfunctional
G words in the first two experiments
were from the subordinated clusters.
Variability in response to individual G 
words within the same cluster is not, of 
course, explained by this hypothesis. 


2. The cluster predominance phenomenon 
may be an artifact of differences in response 
potentialities of individual G words. G 
words which are nonfunctional due to un- 
known factors not controlled in the deter- 
mination of connotative similarity could 
have accumulated by chance in dispropor- 
tionate numbers in the two “weak” sub- 
clusters. The between-cluster differences in 
generalization might reflect only differences
in individual word response potentiality.


If Hypothesis 1 were correct, the predominance
effect should occur in any two-cluster
experiment, regardless of the response
potentialities of individual G words.
If two clusters containing apparently potent
G words (for example, the two dominant
clusters A and N) were paired in the same
experiment, response to one subcluster of 
G words should be substantially less than 
response to the other. Individual stimuli 
in the “suppressed” cluster would elicit less 
response than before, and some G words 
might be expected to shift from functional 
to nonfunctional status. If two previously 
subordinate clusters were paired in an ex- 
periment, one of the clusters would be 
expected to become dominant, with an in- 
crease in individual G word response and 
some shift from nonfunctional to functional 
status of G words. 


If Hypothesis 2 were correct, no cluster 
difference effect would be expected when 
two subclusters of potent G words appear 
in the same experiment. Pairing of two 
previously weak clusters would similarly 
yield little or no cluster difference in gen- 
eralization. Response frequencies of indi- 
vidual G words would be expected to remain 
at approximately the same levels observed 
in the original experiments. 

Experiments 3 and 4 were designed to 
evaluate these alternative explanations of 
the cluster effect. The two previously 
dominant clusters (A and N) were paired
in Experiment 3, and the two previously 
subordinate clusters (B and M) were used 
together in Experiment 4. The Experiment 
1 control words were used in Experiment 3, 
since all D values for control word- 
experimental word connotative comparisons 
exceeded the previously adopted standard 
of 6.00. The Experiment 2 control words 
met the same standard with respect to all 
Experiment 4 experimental words, and were 
used as control words in Experiment 4. 
Four filler words were selected for each 
experiment so that neither experimental 
cluster was favored in connotative similarity.
Some F words were the same as those used
in the earlier experiments, and one was used 
for both Experiments 3 and 4. The verbal 
items used in the two new experiments and 
the altered context of the experimental 
clusters can be seen in Tables 7 and 8. Pro- 
cedures for conducting the experiments 
were identical to those of the first two 
experiments. 


Results 


Experiment 3. Table 7 shows the outcome
of Experiment 3. Response to A"
words was 14.2% of the total possible
response, in contrast to the 29.6% response
to the same words in Experiment 1. Response
to N" words was 15.4%, in contrast
to 35.4% response to these words in Experiment
2. Control word response was 4.2%
in Experiment 3, compared to 7.5% response
to the same words in Experiment 1.
G_T was 5.10 (p < .001); G_A was 4.80
(p < .001); G_N was 5.40 (p < .001);
D_G was .30 (p > .10).

Experiment 4. Table 8 shows the out- 
come of Experiment 4. Response to B" 
words was 20.0% of the total possible response,
in contrast to 14.0% response to
these words in Experiment 1. Response to 
M" words was 8.1%, in contrast to 19.2% 


TABLE 7


PERCENTAGE RESPONSE TO INDIVIDUAL WORDS:
TEST TRIAL, EXPERIMENT 3


L list         %            Words unique      %
words          response     to T list         response

A'                          A"
WOMAN          97           GIRL              25
BODICE         93           CHARMING          19
PRETTY         77           FRAGRANT          11
LOVELY         67           SONG               2
                            All A" words      14.2

N'                          N"
BAD            72           HATE              27
NASTY          61           THIEF             21
STEAL          50           CRIMINAL           9
CROOKED        42           FRAUD              5
                            All N" words      15.4

F                           Control
CITY           82           FAT               14
ART            15           PLAIN              5
RAPID          70           SLOW               4
AFRAID         46           SNAIL              3
                            DIM                3
                            TRUNK              2
                            SQUARE             1
                            LONG               0
                            All C words        4.2
TABLE 8


PERCENTAGE RESPONSE TO INDIVIDUAL WORDS:
TEST TRIAL, EXPERIMENT 4


L list         %            Words unique      %
words          response     to T list         response

B'                          B"
SCORCHING      98           DANGER            27
SCALDING       96           FRIGHTFUL         22
MAD            79           PAIN              20
TROUBLE        65           HURT              11
                            All B" words      20.0

M'                          M"
PATRIOT        86           EFFORT            12
MONEY          78           INCOME            11
PROGRESS       75           LIFT               5
STUDY          41           UP                 4
                            All M" words       8.1

F                           Control
BOX            72           MOLD              14
BLOCK          66           STOUT              7
AFRAID         44           SLEEP              5
LIGHT          35           CALM               2
                            PIG                2
                            SOFT               2
                            ROUND             [illegible]
                            DUSKY              1
                            All C words        4.1


response to the same words in Experiment 
2. Control word response was 4.1% in 
Experiment 4, compared to 9.4% response 
for the same words in Experiment 2. G_T
for Experiment 4 was 4.80 (p < .001);
G_B was 7.75 (p < .001); G_M was 2.10
(p < .05); D_G was 2.85 (p < .01).

The data again confirm the main gen- 
eralization effect. All four clusters of G 
words demonstrated significant generaliza- 
tion. Response for many individual G 
words is substantially greater than response 
to control words. However, the most strik- 
ing result of the context-change experiments 
was the marked shift in response levels of 
individual G words, G word clusters, and 
control words, relative to the levels elicited 
by the same stimuli in the initial experi- 
ments. Three of the four G word clusters 
elicited less than half the response frequen- 
cies observed earlier, while response to the 
remaining cluster increased. One set of 


control words elicited less than half the 
previous response, and the other set evoked 
only slightly more than half. Pronounced 
shifts in response to some individual G 
words can be seen by comparing Tables 3 
and 4 with Tables 7 and 8. 
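
The size of these shifts can be tabulated directly from the percentages reported in the text. The following is a minimal sketch; the dictionary layout and the helper name `ratio` are assumptions introduced for illustration, and the values are the cluster and control percentages quoted for Experiments 1 through 4.

```python
# (initial %, context-change %) as reported in the text.
responses = {
    'A" words': (29.6, 14.2),
    'N" words': (35.4, 15.4),
    'B" words': (14.0, 20.0),
    'M" words': (19.2, 8.1),
    "Experiment 1 controls": (7.5, 4.2),
    "Experiment 2 controls": (9.4, 4.1),
}


def ratio(initial, changed):
    """Context-change response as a fraction of the initial response."""
    return changed / initial


ratios = {name: round(ratio(*pair), 2) for name, pair in responses.items()}

for name, value in ratios.items():
    print(f"{name:24s} {value:.2f}")
```

A ratio below .50 corresponds to "less than half the previous response"; a ratio above 1.00 marks the one cluster whose response increased.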

The decrease in response frequencies for 
all classes of test list words was without 
apparent explanation. Change of context 
had been presumed possibly relevant to 
change in relative amounts of response for 
clusters of G words, but had not been ex- 
pected to reduce the response to both 
clusters, as occurred in Experiment 3, or to 
materially affect response to the control 
words. 

The possibility of some kind of reduction
of the general activity level of the subjects
was considered. The subjects for Experiments
3 and 4 were drawn from the same
psychology classes as those used in the
initial experiments. They were motivated
to participate by the same incentive of two
points final examination credit. Since the
summer weather was somewhat uncomfortable,
temperature and humidity data for the
times of the two sets of experiments were
examined, but no differences were found.
An hypothesis of lowered activity level is
in addition inconsistent with two findings:
response to one group of G words (B")
increased in the second set of experiments,
and response to L list words on the T list
remained relatively constant in the two sets
of experiments.⁸


Response to Individual G Words 


A test of the functional or nonfunctional 
status of the G words as elicitors of gener- 
alized response was conducted for the data 
of Experiments 3 and 4 in the same manner 
as before. The distributions of response 
frequencies for the control words in the 
two context-change experiments were com- 
pared and pooled when no significant differ- 
ences in means or variances were found. 


8 One group of L list words (N') decreased
in response frequencies, but the other three groups
(A', B', and M') evoked comparable response
levels in both the initial and context-change
experiments.


G words with response frequencies greater
than that of the most popular control word 
were tentatively considered “functional,” the 
remaining G words “nonfunctional.” The 
least popular functional G word exceeds 
the control word mean by 3.47 sigma, a 
deviation expected 1 time in 1,000 in a 
distribution with the parameters of the 
control word response distribution. The
most popular nonfunctional G word exceeds
the control word mean by only 1.93 sigma, 
a deviation which is not statistically signifi- 
cant at the .05 level. Classification of G 
words as functional or nonfunctional gen- 
eralization stimuli in the context-change 
experiments thus appears satisfactory, al- 
though it is not quite as distinct as the 
classification on the basis of the data from
the original experiments. Table 9 shows the 
classifications of the 16 G words in the two 
contexts. Nine words retain their previous 
status in the altered context, but seven 
change classification. 
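
The nine-to-seven split can be recovered by tallying the entries of Table 9. The sketch below is a hedged illustration: the dictionary `status` is a transcription of Table 9 made for this example (True for functional, False for nonfunctional), and the variable names are hypothetical.

```python
# Functional status of each G word as
# (Experiment 1 or 2, Experiment 3 or 4); True = functional.
status = {
    "GIRL": (True, True), "CHARMING": (True, True),
    "FRIGHTFUL": (True, True), "THIEF": (True, True), "HATE": (True, True),
    "DANGER": (False, True), "PAIN": (False, True),
    "FRAGRANT": (True, False), "EFFORT": (True, False),
    "INCOME": (True, False), "CRIMINAL": (True, False),
    "FRAUD": (True, False),
    "SONG": (False, False), "HURT": (False, False),
    "LIFT": (False, False), "UP": (False, False),
}

# Words whose classification is the same in both contexts, and those that shift.
retained = sorted(w for w, (first, second) in status.items() if first == second)
changed = sorted(w for w, (first, second) in status.items() if first != second)

print(len(retained), "retain status:", ", ".join(retained))
print(len(changed), "change status:", ", ".join(changed))
```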

Stability of the functional status of more 
than half of the G words in the new context 
appears to confirm the hypothesis of stable 
differences in the response potentialities of 
at least some of the individual generaliza- 
tion stimuli. This tends to substantiate 
Hypothesis 2 above. The continued wide 


TABLE 9


COMPARISON OF FUNCTIONAL STATUS OF G WORDS:
EXPERIMENTS 1 AND 2 VS. EXPERIMENTS 3 AND 4


                          Experiment 1 or 2
Experiment
3 or 4            Functional          Nonfunctional

Functional        (A") GIRL           (B") DANGER
                  (A") CHARMING       (B") PAIN
                  (B") FRIGHTFUL
                  (N") THIEF
                  (N") HATE

Nonfunctional     (A") FRAGRANT       (A") SONG
                  (M") EFFORT         (B") HURT
                  (M") INCOME         (M") LIFT
                  (N") CRIMINAL       (M") UP
                  (N") FRAUD

variability of response to G words within 
subclusters also tends to confirm Hypothesis 
2. Although one cluster of G words yielded 
significantly more generalization than the 
other in Experiment 4, the cluster effect 
fails in Experiment 3, a finding more con- 
sistent with Hypothesis 2 than with Hy- 
pothesis 1. However, in spite of these 
trends in the data, the unstable response 
levels for clusters as well as for individual 
words indicated there was no unequivocal 
choice between the alternatives which Ex- 
periments 3 and 4 had been designed to 
resolve. Context change appeared to be a 
factor which introduced a new component 
of variability in the response frequencies 
for some of the G words, but the unex- 
plained and large variation in over-all 
response levels suggested that unreliability 
of the experimental procedures might ac- 
count for the changes. Confirmation of the 
role of context change in the variability of 
individual word response frequencies and 
resolution of the interpretation of the 
cluster phenomenon appeared to require 
further verification of the reliability of the 
relative differences in response frequencies 
of individual generalization stimuli. 


REPLICATION OF THE INITIAL EXPERIMENTS 


Exact replications of the initial generali- 
zation experiments were conducted to deter- 
mine the reliability of the absolute and 
relative response frequencies for clusters of 
generalization stimuli and for individual G 
words. Experiment 5 duplicated the ma- 
terials, list sequences, and procedures of 
Experiment 1; Experiment 6 repeated Ex- 
periment 2. 


Results 


Tables 10 and 11 show the outcomes of 
the replication experiments. 


Experiment 5. Response to A" words
was 18.3% of the total possible response,
in contrast to 29.6% in the first experiment
and 14.2% in the context-change experiment.
Response to B" words was 12.1%
in replication, in contrast to 14.0% in the 
first experiment and 20.0% in the changed 


TABLE 10


PERCENTAGE RESPONSE TO INDIVIDUAL WORDS:
TEST TRIAL, EXPERIMENT 5


L list         %            Words unique      %
words          response     to T list         response

A'                          A"
BODICE         100          CHARMING          35
LOVELY         76           GIRL              18
WOMAN          76           FRAGRANT          16
PRETTY         62           SONG              04
                            All A" words      18.3

B'                          B"
SCORCHING      87           FRIGHTFUL         19
SCALDING       82           DANGER            14
MAD            77           PAIN              11
TROUBLE        56           HURT              04
                            All B" words      12.1

F                           Control
SUCCESS        98           SQUARE            09
FOOD           72           TRUNK             05
CAR            69           PLAIN             04
AFRAID         51           DIM               03
                            LONG              02
                            FAT               02
                            SNAIL             02
                            SLOW              01
                            All C words        3.6


context. The control words yielded 3.6% 
response, compared to the 7.5% response 
observed in the first experiment and the 
4.2% in the third experiment. G_T for Experiment
5 is 5.55 (p < .001). G_A is 7.05
(p < .001) and G_B is 4.05 (p < .02). The
mean cluster difference score D_G is 1.50
(p > .05).


Experiment 6. Response to M” words 
was 17.1% of the total possible response, 
in contrast to 19.2% response to these words 
in the second experiment and 8.1% response 
in the context-change experiment. Response 
to N” words was 22.9% in replication, 
compared to 35.4% in the second experi- 
ment and 15.4% in the changed context. 
Control words received 5.2% response in 
Experiment 6, compared to 9.4% in Experiment
2 and 4.1% in Experiment 4. G_T
for Experiment 6 is 7.10 (p < .001).
G_M is 5.70 (p < .001), and G_N is 8.50
(p < .001). D_G is 1.40 (p > .05).


Response to Individual G Words 


The distributions of response frequencies 
of the two sets of control words in the 
replication experiments were compared, 
found not significantly different in means 
or variances, and pooled. Response fre- 
quencies of 4 of the 16 G words (SONG, 
HURT, UP, LIFT) fall within the range of 
the pooled control word distribution and 
deviate less than one sigma from the con- 
trol word mean. These four words are 
consistently nonfunctional in all three sets 
of experiments. G word PAIN, classified
nonfunctional in the initial experiment but
functional in the changed context, falls outside
the pooled control word distribution in
the replication experiment and deviates 
from the control word mean by 2.22 sigma. 


TABLE 11


PERCENTAGE RESPONSE TO INDIVIDUAL WORDS:
TEST TRIAL, EXPERIMENT 6


L list         %            Words unique      %
words          response     to T list         response

M'                          M"
PATRIOT        95           EFFORT            30
PROGRESS       72           INCOME            29
STUDY          67           UP                07
MONEY          67           LIFT              02
                            All M" words      17.1

N'                          N"
BAD            74           HATE              27
STEAL          61           THIEF             24
CROOKED        57           FRAUD             21
NASTY          54           CRIMINAL          19
                            All N" words      22.9

F                           Control
ABRUPT         88           STOUT             09
CITY           72           SOFT              08
FEAR           71           ROUND             07
BLOCK          61           CALM              07
                            MOLD              04
                            SLEEP             03
                            PIG               02
                            DUSKY              0
                            All C words        5.2


This word thus appears of borderline functional
status in the replication experiment,
and may be classified nonfunctional if the
.01 level of significance is adopted. The
remaining 11 G words deviate from the 
control word mean by more than three 
sigma units, and are clearly classifiable as 
functional in the replication experiments. 
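
The sigma deviations used for classification can be translated into approximate probabilities. The sketch below assumes a normal distribution of control word response and a two-tailed criterion; neither assumption is stated explicitly in the text, so the figures are illustrative. The function name `two_tailed_p` is hypothetical.

```python
import math


def two_tailed_p(z):
    """Two-tailed probability of a deviation of at least |z| sigma,
    assuming a normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


deviations = [
    ("least popular functional G word, Experiments 3 and 4", 3.47),
    ("most popular nonfunctional G word, Experiments 3 and 4", 1.93),
    ("PAIN, replication experiments", 2.22),
]

for label, z in deviations:
    print(f"{label}: {z:.2f} sigma, p = {two_tailed_p(z):.4f}")
```

Under these assumptions, 1.93 sigma falls just short of the .05 level, and 2.22 sigma passes the .05 level but not the .01 level, which matches the borderline treatment of PAIN.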


COMPARISON OF INITIAL, REPLICATION, 
AND CONTEXT-CHANGE DATA


Table 12 summarizes the G word data 
for the three sets of experiments. Response 
percentages for each G word in initial (I), 
context-change (X), and replication (R) 
experiments, and percentage response in all 
experiments (T) are shown in the first 
column. Column 2 shows the functional 
status of each G word in each experiment. 
Data in the remaining columns are discussed 
in the following section. 

Several conclusions can be drawn from 
comparison of the response data from the 
three sets of experiments: 


1. The experimental method reliably dem- 
onstrates the hypothesized connotatively- 
determined generalization for clusters of 
generalization stimuli. Total response to 
G words and response to subclusters of G 
words significantly exceeds control word re- 
sponse in every instance in the six experi- 
ments. 

2. Differences in relative levels of re- 
sponse for individual G words in a constant 
verbal context are reliably demonstrated. 
Depending on the choice of significance 
level for classification of the “borderline” 
word PAIN in the replication experiment, 
either 14 or 15 of the 16 G words are consistently
classified in the initial and replication
studies as to whether they function or
fail to function as generalization stimuli.
The product-moment correlation between
frequency of response to the 16 G words
in the initial and replication experiments is
.89 (p < .01).

3. The importance of the role of verbal 
context in determining response frequencies 
and functional status of some individual G 
words appears reliably established. Only 9 


of 16 G words are consistent in functional 
status in an altered verbal context, in con- 
trast to 14 or 15 in the exact replication 
study. The correlation of response levels
of G words in the initial experiments and
the context-change experiments is .41, and
the value for the context-change data vs.
the replication data is .43. Neither value
is large enough for statistical significance, 
although the level of correlation suggests 
some stability of response level irrespective 
of context. 

4. Connotative meaning similarity to 
learning list words is always a determinant 
of generalization in the case of some stimuli, 
and never a determinant in the case of some 
other stimuli. Five G words are consistently 
functional (marked CF) in Table 12, 
Column 2, and four G words are consist- 
ently nonfunctional (marked CNF). 


5. The experimental method is consider- 
ably less reliable for the measurement of 
absolute levels of generalized response than 
for relative levels. The distinct drop in fre- 
quency of response to Subclusters A" and 
N" which occurred in the context-change 
experiments also occurs in the replication 
experiments. Response to control words 
was similarly much lower in both context- 
change and replication experiments than in 
Experiments 1 and 2. The significant 
cluster predominance effect which appears 
in Experiments 1, 2, and 4 does not appear 
in the other three experiments, and appears 
interpretable as an artifact of fluctuation 
of response levels for subclusters due to un- 
known factors. The distinctive shifts in 
absolute response levels to subclusters of G 
words and to individual G words indicate 
the necessity of control words as baselines 
for response frequency in verbal experi- 


mentation. 
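
The significance statements attached to the three correlations in Conclusions 2 and 3 can be reproduced with the usual t transformation of a product-moment correlation, t = r √(n − 2) / √(1 − r²), with n = 16 words. This is a sketch under the assumption that a two-tailed t test on 14 degrees of freedom was the criterion; the text does not name the test. The constants are standard tabled critical values, and the function name `t_from_r` is hypothetical.

```python
import math

N_WORDS = 16        # number of G words entering each correlation
T_CRIT_05 = 2.145   # two-tailed critical t, df = 14, alpha = .05
T_CRIT_01 = 2.977   # two-tailed critical t, df = 14, alpha = .01


def t_from_r(r, n=N_WORDS):
    """Convert a product-moment correlation to a t statistic with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


correlations = {
    "initial vs. replication": 0.89,
    "initial vs. context-change": 0.41,
    "context-change vs. replication": 0.43,
}

for label, r in correlations.items():
    t = t_from_r(r)
    if t > T_CRIT_01:
        verdict = "significant at .01"
    elif t > T_CRIT_05:
        verdict = "significant at .05"
    else:
        verdict = "not significant at .05"
    print(f"{label}: r = {r:.2f}, t = {t:.2f}, {verdict}")
```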


USAGE FREQUENCY, MEANINGFULNESS, AND
DENOTATION IN RELATION TO LEVEL
OF GENERALIZED RESPONSE


The association study indicated that asso- 
ciative habits could not account for the 
differences in G word response in the initial 
experiments. Inspection of the association 
data for the additional (L word → G word)
and (G word → L word) associations of
possible relevance in the context-change
experiments⁹ indicated no frequencies as
great as 5%. The original association data
and these additional data appear to quite
definitely rule out Mink paradigm or
Shipley-Lumsdaine paradigm associative
generalization in the context-change experiments.
Three additional variables which
might account for differences in G word
response levels are considered in this
section.


Usage Frequency 


Column 3 of Table 12 shows the Thorn- 
dike-Lorge (1944) usage frequency values 
of the G words. Numbers refer to usage 
frequency per 1,000,000 words. A designates
words with frequencies between 50
and 99 per 1,000,000, AA words with frequencies
of 100 or more per 1,000,000. The
frequency values do not appear to systematically
relate to amount of generalized
response or to the functional-nonfunctional
classification. It may be noted that no
really infrequent words were used.


Meaningfulness 


Column 4, Table 12 shows connotative 
meaningfulness values of the G words, 
defined by 


    M = \sqrt{\sum_{j=1}^{i} d_j^{2}}


where i is the number of Semantic Differ- 
ential scales used in measuring the connota- 
tive meaning of a word and d; is the devi- 
ation of the scale value of a word from 
the midpoint of the scale. This index meas- 
ures the deviation of the Semantic Differ- 
ential profile of a word from that of a 
completely neutral or “connotatively mean- 


E 


? These are “Mink paradigm” associations of the 
form 4’ word —> N” word, N' word — A” 


. Word, В’ word — М” word, and M' word — В” 


E 
1 
be 


word, and “Shipley-Lumsdaine paradigm” associ- 
ations of the form N” word — A’ word, A” 
Word — N’ word, B" word — М" word, and 
M" word —- B! word. 
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ingless" stimulus. When M is large, scale 
values are extreme, and the stimulus may 
be considered vivid or powerful in conno- 
tation, Construct validity for M is sug- 
gested by its high correlation (.71) with 
Noble’s association-based | meaningfulness 
index (Jenkins & Russell, 1956). The 
functional potency of the mediation process 
TmSm associated with a word might plausi- 
bly be related to the value of M. If this 
were so, words with large M values should 
have a relatively great potentiality for elicit- 
ing an instrumental response (cf. Figure 1). 
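As a rough illustration of the index just defined, the sketch below computes M for a profile of scale values. It assumes M is the square root of the summed squared deviations from the scale midpoint, and it assumes seven-step scales scored 1 to 7 with a midpoint of 4. The function name and the two example profiles are hypothetical and are not taken from the study's data.

```python
import math


def meaningfulness(scale_values, midpoint=4.0):
    """Return M: the root of the summed squared deviations from the midpoint.

    scale_values holds one mean scale value per Semantic Differential scale.
    The 1-to-7 scoring and midpoint of 4 are assumptions made for this sketch.
    """
    return math.sqrt(sum((value - midpoint) ** 2 for value in scale_values))


# A completely neutral profile sits on the midpoint of every scale.
neutral_profile = [4.0] * 20

# A hypothetical "vivid" profile deviates 2.5 steps on each of 20 scales.
vivid_profile = [6.5, 1.5] * 10

print(meaningfulness(neutral_profile))          # 0.0
print(round(meaningfulness(vivid_profile), 2))  # 11.18
```

On this reading, a word rated near the midpoint on every scale yields M near zero, and M grows as ratings move toward either pole, regardless of direction.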

The relationship between the M values of 
the G words and level of generalized re- 
sponse in the present experiments sub- 
stantiates these assumptions. The Pearsonian 
correlation of M with total generalized response
for the 16 G words is .66 (p < .01).
There is no overlap of the M values of the 
five consistently functional and the four 
consistently nonfunctional G words. The 
mean M value for the CF words is 6.32, 
for the CNF words 5.13 (p < .01, t test). 
The data indicate that the meaningfulness 
factor is important in the observed differ- 
ences in generalization to individual stimuli. 
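The reported significance level for the correlation can be checked with the conventional t transformation of r. The text does not say which test was applied to the correlation, so this is one standard check rather than the author's procedure. The critical value is the tabled two-tailed .01 value of t for 14 degrees of freedom.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Two-tailed .01 critical value of t for 14 df, from a standard t table.
T_CRITICAL_01_DF14 = 2.977

t_value = t_from_r(0.66, 16)
print(round(t_value, 2))             # 3.29
print(t_value > T_CRITICAL_01_DF14)  # True
```

With 16 words the correlation of .66 gives a t of about 3.29 on 14 degrees of freedom, which exceeds the tabled value and so agrees with the reported p < .01.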


Judged Denotative Similarity 


Inspection of the G words with consist- 
ently low response frequencies suggested 
that some of these words are less similar 
denotatively to learning list words of the 
same connotative cluster than are the G 
words which yielded high generalization 
frequencies. A" word SONG, for instance,
which consistently failed to elicit generali- 
zation, does not seem to partake as much 
of a denotation which might be termed 
“femininity,” as do all of the more potent 
A" words (CHARMING, GIRL, FRAGRANT) 
and the learning list words of the A cluster 
(WOMAN, BODICE, PRETTY, CHARMING). 
Similarly, N" words UP and LIFT, consistently
nonfunctional, do not appear as
denotatively similar to their learning list 
connotative counterparts (PROGRESS, PA- 
TRIOT, STUDY, MONEY) in terms of the 
denotation “social-value-related concept” as 
do the more successful N" generalization 
stimuli (EFFORT and INCOME). 

To test the possibility that differences in
denotative relationships relate to the ob- 
served differences in generalization, 24 male 
and 15 female college students were asked 
to judge the denotative similarity of each 
G word in relation to the four relevant, 
connotatively similar L list words. Appen- 
dix C illustrates the two kinds of forms on 
which the judgments were recorded. In 
Part I of the study, 16 pages like that shown 
in Appendix C-I were used. Subjects were 
asked to rank the four G words of a sub- 
cluster according to amount of denotative 
similarity to one of the relevant L list
words. Denotation was defined at several 
points in the instructions as the “thing to 
which the word refers." Each set of G 
words was ranked against each of the four 
relevant L list words, although pages pertaining
to Clusters A, B, M, and N were
rotated so that words from each cluster 
appeared only once in every four pages. 

Part II of the study used four pages like 
that shown in Appendix C-II. Four G
words from a given cluster appeared below
as before, and the entire learning list subcluster
relevant to the G words appeared
above. Subjects were asked to consider the 
L list subcluster at the top of the page until 
they had an idea of the "thing" these words 
referred to in common, and then to rank 
the G words as to similarity in denotation 
to this common meaning. All 39 subjects 
made all 16 sets of rankings in Part I 
and all 4 sets in Part II, with the exception 
of a few omitted judgments. 

Although the subjects when questioned 
after the judgments indicated they had no 
difficulty in understanding the judgment 
task or the definitions of “denotation” and 
"denotative similarity," there is of course 
no guarantee that denotative similarity was 
in fact the quality being judged. Connota- 
tive similarity, word association, or other 
factors may have influenced the judgments. 
Accordingly, the data are termed “judged 
denotative similarity," and are limited by 
these considerations. 

Column 5 of Table 12 shows the proportion
of Ranks 1 and 2 earned by each G
word in all judgments made in Part I.
Each proportion is determined by approximately
156 judgments (39 subjects × 4
judgments for each G word). The proportion
is an index of the relative number
of "similar" rankings of each G word in
relation to the relevant L list words. The
rank order correlation of the proportion of
"similar" denotative judgments and the total
amount of generalized response across all
16 G words is .51 (p < .05). If denotation
is an important determinant of generalized
response, as the correlation seems to indicate,
the five CF words might be expected
to receive predominantly (more than half)
"similar" judgments, while the four CNF
words should receive predominantly dissimilar
judgments. The proportions are in
the expected direction in six of nine cases, a
trend consistent with the hypothesis but
not statistically significant.

Column 6 of Table 12 shows the propor- 
tion of 1 and 2 ranks ("similar") earned 
by each G word in Part II of the judgment 
study. Each proportion is determined by 
39 judgments, since each G word was 
ranked against the relevant L list subcluster 
only once by each subject. These denotative 
judgments are more closely related to the 
amount of generalized response than the 
Part I judgments. Clusters A" and M" show
perfect correspondence in intracluster rank 
orders of total generalized response fre- 
quency and proportion of similar judg- 
ments. The correspondence of these rankings 
in Cluster B" is as close as tied ranks on 
the denotative variable permit. No intra- 
cluster correspondence is observed, however, 
for Cluster N". The rank order correlation
of amount of generalized response
and proportion of similar judgments across 
all 16 G words is .66 (p < .01). The 
tendency for more than half "similar" judg- 
ments for CF words and less than half 
"similar" judgments for CNF words is in 
the expected direction in eight of nine 
cases, a trend significant at the .05 level 
(sign test). 
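The two directional counts (six of nine in Part I, eight of nine in Part II) can be compared with a sign test, which treats each word as a fair coin toss under the null hypothesis. The sketch below computes the exact upper-tail binomial probability. Whether a one- or two-tailed criterion was used is not stated in the text, so the one-tailed value is printed here and doubling it gives the two-tailed value.

```python
from math import comb


def sign_test_upper_tail(successes, n):
    """One-tailed probability of at least `successes` hits in n trials at p = .5."""
    favorable = sum(comb(n, k) for k in range(successes, n + 1))
    return favorable / 2 ** n


# Part I: six of nine words in the expected direction.
print(round(sign_test_upper_tail(6, 9), 3))  # 0.254

# Part II: eight of nine words in the expected direction.
print(round(sign_test_upper_tail(8, 9), 3))  # 0.02
```

Six of nine falls well short of the .05 level, while eight of nine stays below .05 even when the probability is doubled for a two-tailed test, which matches the conclusions reported for the two parts.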

Both kinds of judgment data appear to 
confirm the hypothesis that denotative mean- 
ing factors not measured by the Semantic 
Differential play an important role in the 
obtained generalization. One hypothesis 
suggested by the data is that some degree 
of denotative similarity is a necessary condition
for generalization. If it can be assumed
that connotation and denotation are
related, it may be that the Semantic Differential
measurements select words with the
necessary denotative similarity in a majority
of instances, but fail to do so in some cases.

The Part II judgment data were obtained 
in a context most similar to that of the
experiments, in that a cluster of L words
was judged in relation to a cluster of G 
words. The context factor appears to be 
one which requires control in judgments 
of this kind as well as in generalization 
proper, since the Part I judgment data, 
obtained in a somewhat different context, 
do not predict generalization frequencies 
as well as the Part II data. 

Another aspect of the hypothesis concerning
the role of denotative similarity
in generalization concerns the primary and
secondary associates of the G words. These
were determined by the most frequent and
next most frequent response in the free
association data, and are shown in Column
7 of Table 12. The primary associate is
first in each case. When there was no
secondary associate with a response frequency
of 5% or greater, none is shown.
The data allow examination of the possibility
that "functional" G words elicit
frequent associates which are similar
denotatively to relevant L list words, while
"nonfunctional" G words elicit associates
more remote in denotation in relation to the
L list words. Three of the four CNF words
(SONG, LIFT, UP) have both primary and
secondary associates whose denotation appears
remote from the denotation of the
relevant L list words. In contrast, none of
the five CF words has both primary and
secondary associates which appear foreign
to the pertinent L list words denotatively,
although in one case the primary associate
is a word opposite in meaning to an L list
word. Judgment of similarity of denotation
is here a priori, so the comparison is suggestive
only. It appears, however, that this
associational index of similarity, which assumes
that the associates of a word are
indices of its "meaning" (Noble, 1952),
lends some additional confirmation to the
hypothesis that a meaning similarity factor
other than scaled connotative meaning is a
determinant of the differences in generalized
response.


Discussion 


The results of the series of experiments 
and the supplementary studies may be ex- 
amined with reference to the main hy- 
potheses of the research, which were 
specified in the initial section. The con- 
sistent elicitation of generalized response 
by groups of word stimuli selected for 
measured connotative similarity to learning 
list words appears to substantiate the hy- 
potheses. The fact that generalization 
occurs between stimuli selected solely on 
the basis of quantitative comparisons of 
independently-obtained meaning profiles for 
words suggests the usefulness of the con- 
ceptualization of meaning as a mediation 
process (Hypothesis a) and of Osgood’s
operational formulation of meaning and
meaning similarity (Hypotheses b and c). 

The hypotheses of the study did not, 
however, predict the variability in general- 
ized response which was observed. Aside 
from the problem of instability in response 
levels of groups of stimuli and individual 
words (which cannot be accounted for by 
any variables examined here), several con- 
siderations emerge from the data which 
require modification of the conclusion that 
the major hypotheses are confirmed. 

One kind of consideration has to do with 
factors other than measured connotative 
meaning which are related to the observed 
generalization. The verbal context of those 
stimuli which are crucial to the generaliza- 
tion paradigm (Figure 2) appears without 
question to be one of these factors. The 
pronounced shifts in generalized response in 
the context-change experiment, which ex- 
ceed the shifts in the replication studies, 
indicate the context factor is powerful 
enough to substantially alter the effect of 
the connotative meaning variable. The 
data indicate that context is likely to be a 
complicating factor in any verbal research 
of this kind in which more than a very 
small number of stimulus elements are 
subjected to experimentation with the same
subjects at the same time. 

The vividness of connotative meaning as 
measured by M is a factor determinable 
by the Osgood technique but which was not 
assumed relevant to generalization until 
after the data had been gathered. This 
factor also appears to be a major one in 
determining level of generalization, but need 
not be a serious problem in further research 
because of the relative ease with which it 
could be controlled. 

Because of difficulty in obtaining “pure”
judgments of denotative relationships, the 
data pertaining to the role of denotative 
similarity in the present findings are not 
entirely satisfactory. Nonetheless the de- 
notative factor appears, not surprisingly, to 
be one which affects mediated generaliza- 
tion. The Osgood meaning measurement 
does not purport to measure denotation, and 
it is difficult to imagine any practicable ex- 
tension of the polar scaling technique to 
cover the almost infinite scope of differences 
in denotative meaning. Some control of 
denotation in further work on measured 
connotative meaning may, however, be 
necessary. 

A second kind of consideration, more 
serious in its implications for the interpre- 
tation of the data in relation to the major 
hypotheses, is the fact that some stimuli 
selected for connotative similarity to learning
list words unvaryingly failed to elicit
any generalized response. For at least 4 
of the 16 G words, high similarity of con- 
notative meaning in relation to learning list 
words did not result in generalization. If 
a stringent scientific logic is adopted, even 
a single negative instance of an affirmative 
generalization (or set of generalizations) 
demands refutation of the generalization. 
If the individual G word is taken as a unit 
of evidence, 25% of the hypothesized instances
of generalization may be said to
have failed to materialize in any of three 
independently conducted sets of experi- 
ments. The question may be raised whether 
any of the hypotheses may be considered 
confirmed in view of this amount of nega- 
tive evidence. 


However, the contemporary viewpoint on 
the use of networks of hypotheses and 
"open" definition of theoretical constructs 
(Braithwaite, 1953; Pap, 1953) urges
flexibility in modification of a set of hy- 
potheses which appear in the main confirmed 
by evidence, but which are also partially at 
variance with the evidence. An entire set 
of hypotheses need not be considered re- 
futed by instances of negative evidence, but 
instead a decision can be made as to the 
point in the theoretical framework which 
may be most strategically modified. In this 
view, any of the Hypotheses a, b, or c 
could be considered inadequate on the 
basis of the negative evidence, and cor- 
respondingly modified. The remaining hy- 
potheses would be retained on the basis of 
substantiation by the positive findings. The 
mediation hypothesis (Hypothesis a) might,
for instance, be treated as disconfirmed, at 
least for some of the generalization stimuli. 
Alternatively, R_SD might be considered to
have been shown inadequate as an index of
the mediator (Hypothesis b). The Semantic
Differential might be regarded as a measurement
which overlaps partially with the elements
of the mediator r_m-s_m, but which is
not sufficiently similar for precise measurement.
Or it might be hypothesized that the
mediation hypothesis and the meaning
measurement technique are valid, but that D
adequately measures connotative similarity
for only part of the interword comparisons
(Hypothesis c).

Modification of Hypothesis b seems the 
most plausible way to interpret the negative 
evidence from the present study. The medi- 
ation hypothesis seems the best confirmed 
of the three on the basis of previous re- 
search on secondary generalization, and it 
seems reasonable to interpret it as addition- 
ally substantiated by the main body of the 
present evidence. The hypothesis that R_SD
is a suitable operational reduction of the
mediation construct seems, in contrast, only
partly confirmed by the present evidence.
The positive evidence is interpretable as
lending some “construct validity” (Cronbach
& Meehl, 1955) to the Semantic
Differential as a measure of the representational
mediation process, but the negative
evidence suggests that R_SD cannot be
considered isomorphic with this process.
The role of denotative meaning and of
meaningfulness in determining generalization
strengthens this interpretation. It is possible
that better control of extraneous factors
would function to increase the validity
of R_SD, although consistent absence of
generalization to the four CNF words suggests
an upper limit to the increase.

Explanation of negative evidence by 
modification of Hypothesis c is also a possi- 
bility. Alternative methods of measuring 
similarity (for instance, focus on factorial 
dimensions rather than profile differences 
with all scale deviations weighted equally) 
might reveal that D is the element at fault 
in the failure of connotative similarity to 
consistently determine generalization. The 
data on factors other than connotative meaning
suggest, however, that Hypothesis b
would probably also have to be modified in
any case.

Further research directed toward con- 
trolling or exploring the effect of context, 
meaningfulness, denotation, association, and 
other factors relevant to verbal generaliza- 
tion should reveal the extent to which this 
interpretation of the present data is ade- 
quate, and should make more precise the 
extent to which R_SD can be considered a
valid measurement of “meaning” as treated 
by the Osgood theory. 


SUMMARY 


A set of three hypotheses derived from 
Charles Osgood’s approach to the nature 
and measurement of meaning were ex- 
amined experimentally. The hypotheses 
concerned meaning as a representational 
mediation process, the validity of the 
Semantic Differential technique as a meas- 
urement of connotative meaning, and the 
Osgood-Suci index of profile similarity. 


Semantic Differential profiles of words 
which had been previously scaled were used 
as the basis for determining connotative 
meaning. Six experimental studies were 
conducted to examine the hypothesis that 
mediated stimulus generalization will occur 
between stimuli with similar connotative 
meaning profiles. A lever-press response 
was taught to each member of clusters of 
connotatively similar words, and generaliza- 
tion was tested by presentation of additional 
words which were connotatively similar to 
the training words. The series of experi- 
ments included replication for stimulus 
items, alteration of verbal context of the 
stimuli, and exact replication of two experi- 
ments to determine the stability of general- 
ized response. Supplementary analyses were 
done to determine the role of associative 
habits, usage frequency, meaningfulness, 
and denotative similarity. 

The hypothesized generalization was ob- 
tained in each of 12 analyses of responses 
to groups of generalization stimuli. However,
wide variability in generalized response
was observed which could not be
accounted for by the connotative meaning 
variable. A group of stimuli was identified 
for which generalized response unvaryingly 
failed to occur, in spite of the theoretical 
equivalence of these stimuli and the others. 

The verbal context of the stimuli, con- 
notative meaningfulness, and judged deno- 
tative meaning similarity were found to 
relate to the variation in generalized re- 
sponses. Usage frequency and associative 
habits did not appear to relate to variability 
in generalization. 

The results were interpreted as sub- 
stantiating the mediation hypothesis and 
as lending some degree of construct validity 
to the Semantic Differential as a technique 
for measuring meaning. Alternative inter- 
pretations of the findings and implications 
for further research were discussed. 


APPENDIX A


SEMANTIC DIFFERENTIAL SCALES USED BY JENKINS, RUSSELL, AND SUCI


cruel : kind
curved : straight
masculine : feminine
untimely : timely
active : passive
savory : tasteless
unsuccessful : successful
hard : soft
wise : foolish
new : old
good : bad
weak : strong
important : unimportant
angular : rounded
calm : excitable
false : true
colorless : colorful
usual : unusual
beautiful : ugly
slow : fast


APPENDIX B


FACTOR LOADINGS OF THE SCALES USED BY JENKINS, RUSSELL, AND SUCI


Factors: I = Evaluation, II = Potency, III = Activity, IV = Stability,
V = Tautness, VI = Novelty, VII = Receptivity, VIII = Unnamed.

Scale                         I     II    III     IV      V     VI    VII   VIII     h²

Good-bad                   1.00    .00    .00    .00    .00    .00    .00    .00   1.00
Timely-untimely             .37    .04    .04    .05    .05    .01    .05    .01    .15
Kind-cruel                  .52   -.28    .00    .16   -.07    .02    .12   -.07    .41
Beautiful-ugly              .52   -.29   -.02    .03   -.06    .06    .14    .02    .38
Successful-unsuccessful     .51    .08    .29    .06    .00    .06    .09    .12    .38
Important-unimportant       .38    .04    .31    .04    .00   -.02    .09    .02    .25
True-false                  .50   -.03    .01    .29   -.06   -.01    .00    .05    .34
Wise-foolish                .57    .06    .11    .22   -.03   -.02    .10    .05    .40
Hard-soft                  -.24    .97    .00    .00    .00    .00    .00    .00   1.00
Masculine-feminine         -.14    .47    .03   -.01    .16   -.05   -.01    .06    .27
Strong-weak                 .30    .40    .10    .12    .00   -.03    .04    .11    .28
Active-passive              .17    .12    .98    .00    .00    .00    .00    .00   1.00
Excitable-calm             -.15    .03    .26   -.13    .00    .05    .13   -.04    .13
Fast-slow                   .01    .26    .35   -.05    .15   -.01    .05    .15    .24
Angular-rounded            -.12    .26    .16   -.06    .95    .00    .00    .00   1.00
Straight-curved             .08    .12    .14    .06    .27    .05   -.03   -.02    .12
New-old                     .20   -.09    .09    .00    .05    .97    .00    .00   1.00
Unusual-usual              -.04    .02    .03    .00    .03    .25    .12    .03    .08
Savory-tasteless            .23   -.12    .18    .04   -.05    .06    .95    .00   1.00
Colorful-colorless          .20   -.20    .09   -.04   -.10    .09    .27    .08    .18




APPENDIX C 


SAMPLE PAGES FROM DENOTATIVE JUDGMENT STUDY 


I 


BODICE 
Charming Girl Fragrant Song 
II 
BODICE 
LOVELY 
PRETTY 
WOMAN 
Charming Girl Fragrant Song 


SOME ANTECEDENTS AND CONSEQUENTS OF
MASCULINE SEX-TYPING IN ADOLESCENT BOYS 


PAUL MUSSEN ¹
University of California, Berkeley 


There is a small but substantial body of
research indicating that appropriate
sex-typing of behavior in young boys is a
consequence of strong identification with
the father (Levin & Sears, 1956; Mussen
& Distler, 1959, 1960; Payne & Mussen,
1956). According to most of the findings,
strong emotional allegiance is a major
antecedent condition of such identification.
The evidence thus generally appears to
support what Mowrer (1950) has labeled
the "developmental identification hypothesis,"
which states that identification represents
"an attempt on the part of the infant
to reproduce bits of the beloved and
longed-for parent" (p. 615).

Among the most pertinent findings are
those of Levin and Sears (1956), who
found that nursery school boys who were
highly aggressive in doll-play (presumably
a manifestation of masculine sex-typed
behavior) tended to be strongly identified with
their fathers. In another study in which
doll-play stories were used to assess the
familial attitudes of kindergarten boys,
those with high scores on a semiprojective
test of masculinity (the IT Scale) revealed
stronger attachments to their fathers than
did boys who scored low in this test
(Mussen & Distler, 1959).

Obviously, the process of sex-typing is
not completed during early childhood, but
continues well beyond this period. The
relationship between later phases of this
process and parental identification has not
been systematically studied, although there
is some suggestive evidence that "boys who
are closely identified with their fathers tend
to have more characteristically masculine
attitudes than their peers who are less
highly identified with their fathers" (Payne
& Mussen, 1956, p. 361).

¹ The author gratefully acknowledges the
cooperation of the late Harold E. Jones, Director of
the Institute of Human Development, University of
California, Berkeley, in making the extensive data
of the Adolescent Growth Study available for this
research.

The present research is concerned, first, 
with the nature of the parent-child relation- 
ships antecedent to a high degree of mas- 
culine sex-typing in adolescent boys, and 
secondly, with some concurrent and sub- 
sequent correlates (henceforth referred to 
as consequents) of high masculinity during 
this period. The measure of appropriacy 
of sex-typing was the degree of masculinity 
of interests. 

The first hypothesis of the study was 
based on Mowrer's "developmental hypothe- 
sis of identification" and the evidence, cited 
above, supporting it (Mowrer, 1950). In 
specific terms, the hypothesis states that 
adolescent boys whose interests are strongly 
and appropriately sex-typed regard their 
relationships with their fathers as favorable 
and rewarding, while boys whose interests
are not so characteristically masculine are
less likely to consider their interactions with 
their fathers so satisfying. 

The second hypothesis is based on the 
assumption that as the boy matures, he is 
likely to encounter increasing familial, peer, 
and general societal pressure to identify
with his father and thus to "learn to think,
feel, and act like a member of his own sex"
(Brown, 1958), i.e., to adopt sex-appropriate
motivational patterns and personality
characteristics as well as overt masculine 
behavior and interests. It might therefore 
be anticipated that with increased age, the 
various qualities that compose the male sex 
role form a coherent pattern, becoming 
more consistent, crystallized, and consolidated.
This expectation constituted the
second hypothesis of the present study 
which, stated very generally, maintains that, 
among adolescent boys, those who acquire 
highly masculine interests will also develop 
personal qualities generally considered to 
be characteristic of males in our culture. 
In contrast, adolescent boys who have de- 
veloped relatively feminine interests—and, 
it may be inferred, are more strongly identi- 
fied with the opposite-sex role—might be 
expected to manifest more feminine personal 
and social characteristics. 

In order to test this hypothesis, it was, 
of course, necessary to define masculine and 
feminine traits. In the present study, the 
definitions were guided by Parsons' analysis 
of the instrumental-expressive (or task vs. 
social-emotional) polarity of functions of the 
male and female sex roles (Parsons, 1955) 
and, as we shall see later in greater detail, 
Brim's (1958) specification of the personal 
qualities attributable to these sex roles. 
Phrased more precisely in terms of Parsons' 
conceptualization of the two sex roles, the 
second hypothesis states that high mascu- 
linity of interests will be positively corre- 
lated with instrumental characteristics (e.g., 
adequacy, achievement needs, control), and 
low masculinity (i.e., relatively feminine
interests) will be associated with emotional- 
expressive qualities (e.g., affection, de- 
pendence, gregariousness) that describe the 
female role. 

The third and fourth hypotheses were 
concerned with contemporaneous and long- 
term consequents of appropriate sex-typing 
of behavior and interests during adolescence 
on general social and emotional adjustment. 
It may be assumed that adolescent boys 
who attain a high degree of appropriate 
sex-typing of behavior—thus fulfilling the 
expectations of parents, peers, and society 
at large—experience greater degrees of 
social acceptance and more favorable socio- 
psychological milieux than do their peers 
who are less masculine in behavior and 
attitudes. If this is a valid assumption, it 
can be hypothesized that adolescent boys 
who are strongly identified with the male 
sex role are more likely to become more 
stable emotionally and better adjusted
socially than those who are low in masculinity
(Hypothesis 3).

The fourth hypothesis was based in part 
on the widely accepted belief that the 
quality of the individual’s adjustment dur- 
ing adolescence strongly affects his subse- 
quent, adult adjustment. If this is in fact 
true, and if the third hypothesis above is 
verified, then it follows that a high degree 
of appropriate sex-typing during adolescence 
will be more closely related to adequate 
personal and social adjustment in adulthood 
than will a low degree of masculine sex- 
typing during this period (Hypothesis 4). 

Extensive and intensive longitudinal rec- 
ords from the University of California 
Adolescent Growth Study (Jones, 1938, 
1939a, 1939b, 1940), obtained during the 
subjects’ adolescence and adulthood (early
thirties), were used in testing the major 
hypotheses. The data were collected by 
various investigators associated with the 
study in connection with their own research 
problems (see, for example, Carter, 1940, 
1944a, 1944b; Frenkel-Brunswik, 1942; 
Newman, 1946; Tryon, 1939a, 1939b, 1943, 
1944; Tuddenham, 1941, 1952, 1959; Tuddenham
& McBride, 1959). Included were
masculinity-femininity scores, projective test 
results, and an array of observational 
ratings and tests (some collected during 
adolescence, some during adulthood) which 
provided a basis for the evaluations of per- 
sonality structure and of emotional and 
social adjustment. These will be described 
more fully below. 


PROCEDURE 
Subjects 


During their senior year in high school, at ages
17 and 18, 68 boys who were subjects of the
Adolescent Growth Study were given the Strong
Vocational Interest Blank (Strong, 1943) by
Carter (1940). This test yields, in addition to
occupational interest scores, a Masculinity-Femininity
(MF) score indicating the degree of
similarity of the individual's interests to those
characteristic of American men in general (Strong,
1943). In the present study, these scores constituted
the criterion of masculine sex-typing
of interests. The subjects of the study were 39
boys representing two extreme contrasting groups:
the 20 with the most masculine scores, and the 19
with the most feminine (least masculine) scores.
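The extreme-groups selection described above can be sketched as a simple ranking step. The subject identifiers and scores below are invented placeholders, and the sketch assumes that a higher MF score means more masculine interests. The actual scores are not reproduced in this section.

```python
def extreme_groups(scores, n_high, n_low):
    """Return (high, low) lists of subject ids with the most extreme scores.

    scores maps a subject id to an MF score; higher is assumed to mean
    more masculine interests.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_high], ranked[-n_low:]


# Hypothetical MF scores for 68 subjects (placeholder values, all distinct).
scores = {f"S{i:02d}": (i * 37) % 101 for i in range(1, 69)}

high_group, low_group = extreme_groups(scores, n_high=20, n_low=19)
print(len(high_group), len(low_group))  # 20 19
```

The 29 subjects in the middle of the distribution are simply left out, which sharpens the contrast between groups at the cost of saying nothing about boys with intermediate scores.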


Tests and Ratings


The basic data used in testing the major hypotheses
were derived from diverse sources: (a)
a series of personality tests, some administered
during the senior year in high school and others
approximately 16 years later; (b) ratings of
drives, appearance, personality, and social behavior
made by trained observers who were members
of the Adolescent Growth Study staff; and (c)
results of sociometric questionnaires (Reputation
Tests) answered by the subjects’ peers. More
specifically, tests of the first three hypotheses involved
the following test records and ratings
collected during the same academic year as the
Strong Vocational Interest Inventories, the senior
year of high school:

1. Responses to an individual TAT, consisting
of 18 pictures, flashed on a screen (administered 
by H. E. Jones, described by Mussen & Jones, 
1957). Nine of the pictures came from the Murray 
set which is now standard (Cards 1, 5, 6, 7 BM, 
10, 11, 14, 15, 17); five pictures from the set 
current in 1938 when these data were collected 
(a man and a woman seated on a park bench; 
a bearded old man writing in an open book; a
thin, sullen young man standing behind a well- 
dressed, older man; a tea table and two chairs; an 
abstract drawing of two bearded men); and four
pictures not in the Murray series, designed to elicit 
the expression of feelings and emotions (a ma- 
donna and a child, the nave of a large church, a 
dramatic view of mountains, a boy gazing at a 
cross wreathed in clouds). Most of the stories 
told in response to the pictures were very brief, 
consisting of only one or two sentences. 

A scoring scheme with a total of eight needs, 
familial press, and descriptive categories, each 
defined as specifically as possible, was employed in 
analyzing these protocols. The assumption under- 
lying the use of this scheme was that the story- 
teller identifies with the hero, the hero's needs and 
self-conceptions being the same as the storyteller’s, 
the press impinging on the hero being the ones 
that the subject perceives as affecting himself. 
The subject's score in each TAT category was 
derived simply by counting the number of stories 
in which a response in the given TAT category 
appeared. Tables 1 and 2 present the categories 
used, together with brief definitions of each. 
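
The counting rule described above (one point per story in which a category appears) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the data layout, and the sample protocol are hypothetical and are not taken from the study's records.

```python
from collections import Counter
from typing import Iterable, List


def tat_category_scores(stories: List[Iterable[str]]) -> Counter:
    """Score one subject's protocol.

    `stories` holds, for each story, the category labels a scorer assigned
    to it. A category earns one point per story in which it appears,
    regardless of how many times it occurs within that story.
    """
    scores: Counter = Counter()
    for categories in stories:
        scores.update(set(categories))  # de-duplicate within a story
    return scores


# Hypothetical three-story protocol (labels follow Tables 1 and 2).
protocol = [
    ["F+", "n Achievement"],
    ["F+", "F+"],   # repeated label still counts once for this story
    [],             # story with no scorable response
]
scores = tat_category_scores(protocol)
print(scores["F+"], scores["n Achievement"], scores["F-"])
```

A `Counter` returns zero for categories that never appear, which matches the way a subject with no scorable stories in a category would receive a score of zero.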

The reliability of this analysis was tested by 
having 10 complete protocols (180 stories) inde- 
pendently scored by the author and another 
psychologist.² The percentage of interrater agree- 
ment was 93, computed by the usual formula 
(number of agreements divided by number of 
agreements plus number of disagreements). In 
order to eliminate bias, the scoring used in the 
present study was done "blind," i.e., independently 
of knowledge of subject's masculinity status. Some 
of the TAT scores were used in testing the first 
hypothesis; others for testing the second. 

² I am indebted to Walter Turner for his par- 
ticipation in this aspect of the study. 
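
The agreement formula quoted above is simple enough to state as a short sketch. The tallies used here are hypothetical, chosen only so that the result equals the reported 93 percent; the text does not give the actual numbers of agreements and disagreements.

```python
def percent_agreement(agreements: int, disagreements: int) -> float:
    """Interrater agreement as a percentage:
    agreements / (agreements + disagreements) * 100."""
    total = agreements + disagreements
    if total == 0:
        raise ValueError("at least one scoring decision is required")
    return 100.0 * agreements / total


# Hypothetical tallies (not from the study), chosen to reproduce 93%.
print(percent_agreement(agreements=93, disagreements=7))  # 93.0
```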


2. Clubhouse ratings, made during the subjects’ 
adolescence by three members of the Adolescent 
Growth Study staff, were based on intensive ob- 
servations of behavior and attitudes in a specially 
constructed clubhouse where the boys and girls 
in the study met in mixed groups, conversed, 
played, and danced (Newman, 1946). The seven- 
point rating scales developed for this purpose 
(Jones, 1940) dealt with specific aspects of be- 
havior and included:

detailed criteria concerning a number of expres- 
sive characteristics (poise, reserve, energy output, 
talkativeness, etc.), items involving social orien- 
tation (interest in the same and in opposite sex, 
drive for social contacts, discrimination in social 
contacts, etc.); items involving social status 
(popularity, stimulus value, leadership, etc.); 
and a series of other items pertaining to appear- 
ance, attitudes, and activities (p. 236). 


3. Institute ratings were made during the sub- 
jects’ periodic visits to the Institute of Human 
Development for mental and physical examinations 
(Newman, 1946). The subjects generally came 
to the institute in small, same-sex groups (six to 
eight at a time). Three staff members observed 
their free play behavior in this situation and rated 
the subjects on a series of personality characteristics 
and social behaviors comparable to those included 
in the clubhouse ratings (e.g., expressiveness, 
masculine behavior, sociability, relaxedness, cheer- 
fulness, carefreeness, and confidence). 


4. Drive ratings. About a year after the sub- 
jects had graduated from high school, three staff 
members, working under Else Frenkel-Brunswik’s 
direction, and using many accumulated sources of 
data (particularly the behavior ratings described 
above), rated most of the boys on nine “under- 
lying” drives: autonomy, social ties, achievement, 
recognition, abasement, aggression, succorance, 
control, escape (Frenkel-Brunswik, 1942). These 
drive ratings presumably refer to “a level of per- 
sonality which stands behind the surface of overtly 
displayed social techniques" (p. 144). “Though 
ultimately referred to observed behavior, reference 
to the underlying motivations was established as 
the result of a complex process of inference 
utilizing more subtle, indirect cues together with 
gross features of behavior" (p. 261). In short, 
the drive ratings "constituted an attempt to pene- 
trate beneath the surface manifestations of per- 
sonality and gauge directly the central motivational 
organization of each subject, using whatever com- 
bination of observation and intuition the rater felt 
appropriate" (Tuddenham, 1959, p. 10). 

5. The Reputation Test was designed to measure 
the subjects' reputations among their classmates 
(Tryon, 1943; Tuddenham, 1941). Word portraits 
(e.g., "here is someone whom everyone likes") 
were presented to pupils who supplied the names 
of classmates they regarded as fitting these. The 
frequency of nominations provided the basis for 
determining the subject's reputation with respect 
to such traits as restlessness, talkativeness, activity, 
humor, friendliness, attention-getting, etc. 

The clubhouse, institute, and drive ratings, and 
data from the Reputation Test were used in test- 
ing Hypotheses 2 and 3. 

6. The University of California Inventory, a 
self-report schedule of 270 items formulated in 
such a way as to permit an indirect approach to 
the assessment of several personality characteristics 
and adjustment in several areas (Tryon, 1934, 
1939a, 1939b), yielded further data relevant to 
the verification of the third hypothesis. It pro- 
vided scores in the following categories: family, 
school, and social adjustment; personal inferiority; 
physical symptoms; fears; general tension; over- 
statement; total adjustment. 

In order to test the fourth hypothesis, which 
deals with the adult adjustment of boys who were 
high and low in masculinity of interests during 
adolescence, two tests were administered to many 
of the Adolescent Growth Study subjects about 
16 years after they graduated from high school: 

7. The California Psychological Inventory, an 
objective personality test, attempts to assess as- 
pects of personality, such as motives, maturity, and 
achievement, which are significant in social rela- 
tionships and interpersonal behavior. There are 
18 scales which describe individuals in terms of 
social responsibility, tolerance, flexibility, academic 
motivation, femininity, self-control, capacity for 
status, dominance, etc. (Gough, 1957). 

8. The Edwards Personal Preference Schedule 
(Edwards, 1954) is a standardized self-report 
device for adults which purports to measure 15 
basic personality needs originating in H. A. 
Murray's system (e.g., achievement, deference, 
order, exhibitionism, autonomy, affiliation, suc- 
corance, abasement, heterosexuality). 


RESULTS 


The hypotheses were tested by determin- 
ing whether or not the scores derived from 
the various test, rating, and sociometric 
instruments were significantly related to 
masculinity of interests. A frequency dis- 
tribution of the scores of all subjects was 
constructed for each variable, and each 
distribution was dichotomized as closely as 
possible to the median. Subjects having 
scores above the dichotomization point were 
considered "high" in this particular vari- 
able; those with scores below this point 
were considered "low." * Chi square tests 
were then applied to ascertain whether or 
not high scores in certain variables were, 
as was predicted on the basis of the hy- 
potheses, more characteristic of one group 
(those high or low in masculinity of in- 
terest) than of the other. It should be noted 
that the hypotheses tested were one-sided 
hypotheses, while the chi square value is in 
terms of a two-sided hypothesis. When chi 
square has only one degree of freedom, the 
square root of chi square has a distribution 
which is the right-hand half of a normal 
distribution. In order to test a one-sided 
hypothesis, the chi square test must be con- 
verted into the equivalent value in terms of 
a unit normal deviate (Fisher, 1938). The 
levels of significance reported in this section 
were evaluated in these terms. 

* While there were 39 subjects, not all scores 
and ratings were available for all subjects on all 
variables; hence, the total frequency in some of 
the distributions was less than 39. 
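
The conversion described above can be sketched in Python. With one degree of freedom, the square root of chi square is treated as a unit normal deviate, and the one-sided probability is the upper tail beyond it (half the two-sided value) when the observed difference lies in the predicted direction. The function names and the optional `yates` continuity correction are assumptions made for illustration; the text does not state whether a correction was applied. The counts in the usage lines are those of the Father positive row of Table 1.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int, yates: bool = False) -> float:
    """Chi square (1 df) for the fourfold table [[a, b], [c, d]].

    `yates=True` applies the usual continuity correction; whether the
    original analysis used it is not stated, so it is optional here.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column total must be positive")
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2.0)
    return n * diff ** 2 / denom


def one_sided_p(chi2: float, in_predicted_direction: bool = True) -> float:
    """One-sided probability from a 1-df chi square via z = sqrt(chi2)."""
    z = math.sqrt(chi2)
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z >= z)
    return upper_tail if in_predicted_direction else 1.0 - upper_tail


# Father positive (Table 1): 12 of 19 highs and 5 of 20 lows scored high.
chi2 = chi_square_2x2(12, 7, 5, 15, yates=True)
print(round(chi2, 3), round(one_sided_p(chi2), 3))
```

The direction flag matters: a large chi square produced by a difference opposite to the prediction gives a one-sided probability above .5, not a small one.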

The findings are presented below in four 
sections, each dealing with one of the major 
hypotheses of the research. 


Hypothesis 1 


The first hypothesis stated that adolescent 
boys with highly sex-typed patterns of in- 
terests perceive their relationships with 
their fathers as more favorable than boys 
low in masculinity of interests do. On the 
basis of this hypothesis, it was predicted 
that the TAT stories of the former group 
would contain significantly more instances 
of "positive father relationship" (i.e., higher 
scores in this variable which presumably 
reflect rewarding interactions with the 
father) than the protocols of the group low 
in masculinity would. 

Table 1 summarizes the TAT findings 
on the familial need and press variables. 
According to the table, a significantly 
greater portion of the highly masculine boys 
did, in fact, score high on the “positive 
father relations” variable. Thus, the evi- 
dence supported the prediction and the 
hypothesis (Hypothesis 1), essentially the 
developmental identification hypothesis, 
from which it was derived. 

* Variables with peculiar distributions of scores 
(e.g., very narrow range of scores with half or 
more of the cases concentrated at the median 
score) were eliminated from further consideration. 

TABLE 1 
NUMBER IN HIGH AND LOW MASCULINITY GROUPS SCORING HIGH IN TAT FAMILIAL VARIABLES 

TAT Variable         | Definition                                                                                                                   | Highs with high scores (N = 19) | Lows with high scores (N = 20) | p 
Father positive (F+) | Hero is loved, helped, encouraged, or given something by father                                                              | 12                              | 5                              | .02 
Father negative (F−) | Hero is rejected, scorned, disapproved by father; or father forces hero to do something or prevents him from doing something | 1                               | 5                              | .08 
Mother positive (M+) | Mother behaves positively toward hero; analogous to F+                                                                       | 8                               | 11                             | ns 
Mother negative (M−) | Mother behaves in negative way toward hero; analogous to F−                                                                  | 9                               | 10                             | ns 

Of the total group of 39 subjects, only 
six told one or more stories involving “nega- 
tive father relationships.” Of the six who 
told stories that could be scored in this 
category, five were low in masculinity of 
interests and only one was in the highly 
masculine group; that is, 18 of the 19 highly 
masculine boys, but only 15 of the 20 low 
in masculinity, scored zero in this variable. 
Insofar as absence of instances of responses 
in this category indicates lack of dissatis- 
faction with the father (or favorable per- 
ceptions of that parent), the direction of 
this difference clearly reinforces the results 
with the “positive father relationships” vari- 
able, and thus provides further support of 
the developmental identification hypothesis. 
In view of the small number of cases in- 
volved, however, the finding must be con- 
sidered only suggestive. 
According to these data, the two groups 
of subjects did not deviate significantly 
from each other in either of the TAT 
variables involving relationships with the 
mother (“favorable mother relationships” 
and “unfavorable mother relationships”). 
Apparently a high level of masculinity of 
interests among boys is related to warm 
and affectionate relationships (and rela- 
tively little feeling of dissatisfaction) with 
the father, but is not strongly influenced 
by the nature of the boy's relationships with 
his mother. This finding seems entirely 
consistent with Mussen and Distler's (1960) 
conclusion that, among kindergarten boys, 
"high degrees of masculinity were fostered 
by affectionate father-son interactions, but 
were not significantly affected by mother- 
son relationships" (p. 98). 


Hypothesis 2 


A number of test scores and ratings 
were pertinent to verification of the second 
hypothesis. According to this hypothesis, 
high masculinity of interests is associated 
with instrumental characteristics, while rela- 
tively feminine interests are linked with the 
so-called emotional-expressive qualities. In 
devising tests of this hypothesis, all TAT, 
rating, and sociometric variables were ex- 
amined to determine which of them could 
be categorized as instrumental or expressive 
traits. Decisions about the assignment to 
the male (instrumental) or female (expres- 
sive) categories were based on Parsons’ 
(1955) writings and on the consensus of 
opinions of Brim and several other sociolo- 
gists who had judged the congruence of 
these and similar traits with the instrumental 
and expressive roles (reported in Brim, 
1958). 

Some of the variables relevant to Hy- 
pothesis 2 might also be regarded as rele- 
vant to Hypothesis 3 which is concerned 
with the relationship between sex-typing 
and general adjustment. For example, 
certain instrumental characteristics, such as 
self-confidence and self-control, are also re- 
garded as manifestations of good adjust- 
ment. The decision to designate a given 
characteristic as primarily pertinent to the 
second (sex-role characteristics) hypothesis, 
rather than to the third (general adjust- 
ment) hypothesis, was admittedly arbitrary 
in a number of instances. 

In the following discussion of the results 
of tests of the second hypothesis, the de- 
pendent variables (personality characteris- 
tics and motivation) are grouped according 
to the methods used in obtaining the meas- 
ures. For example, three TAT categories, 
listed and defined in Table 2, were judged 
to be pertinent to Hypothesis 2. It was 
predicted that, for each of these variables, 
the scores of boys high and low in mas- 
culinity of interests would be significantly 
different. More specifically, if the hy- 
pothesis is valid, those who had acquired a 
high degree of appropriate sex-typing of 
interests would score higher in need 
Achievement (corresponding to the instru- 
mental characteristic ambition) and need 
Aggression (related to “aggressiveness,” an 
aspect of the male role). However, they 
would be lower in negative characteristics, 
low scores in this category presumably in- 
dicating greater self-esteem and self-confi- 
dence (characteristics assigned to the male, 
instrumental role). 


TABLE 2 
NUMBER OF SUBJECTS IN HIGH AND LOW MASCULINITY GROUPS SCORING HIGH IN TAT 
VARIABLES RELEVANT TO HYPOTHESIS 2 

TAT Variable             | Definition                                                                           | Highs with high scores (N = 19) | Lows with high scores (N = 20) | p 
n Achievement            | Hero attempts to attain a high goal or do something creditable                       | 1                               | 8                              | ns 
n Aggression             | Hero expresses hostility in physical or verbal way or has aggressive thoughts or feelings | 7                          | 11                             | ns 
Negative characteristics | Hero is described in negative terms (e.g., stupid, weak, unpleasant)                 | 6                               | 15                             | .01 

Examination of Table 2 shows that the 
two groups differed significantly from each 
other only in the negative characteristics 
score. As predicted, assuming the subject’s 
descriptions of the heroes of his stories 
reflect his feelings about himself, the highly 
masculine interest group manifested fewer 
negative self-concepts and, indirectly, more 
positive self-concepts and self-confidence. 
In brief, as far as the TAT data are con- 
cerned, support for the hypothesis is very 
limited; and there is no evidence that the 
two groups differ with respect to underlying 
achievement and aggressive motives, which 
are pertinent to the male role. It should be 
noted, however, that the measure of aggres- 
sive motivation used here may be fallible 
as an operational criterion of aggression as 
that term is used by Parsons (1955) or 
Brim (1958). These sociologists may be 
referring primarily to dominant and com- 
petitive behavior, while the TAT measure 
used here may be essentially an index of 
the strength of underlying hostility. 

The series of clubhouse and institute rat- 
ings also yielded basic data on the manifest 
personality characteristics associated with 
the two sex roles. Table 3 lists the specific 
variables related to the second hypothesis 
that were derived from these scales, the 
number of subjects in each of the two 
groups rated high (above the dichotomiza- 
tion point) in these characteristics, and the 
exact probabilities of obtaining these distri- 
butions of high and low scores in the two 
groups (or all other possible, more extreme 
sets), calculated in accordance with Fisher's 
(1938) method. 

TABLE 3 
NUMBER OF SUBJECTS IN HIGH AND LOW MASCULINITY GROUPS RATED HIGH IN CLUBHOUSE 
AND INSTITUTE SCALES RELEVANT TO HYPOTHESIS 2 

Rating Dimension                          | Scale     | Highs rated high (N = 19) | Lows rated high (N = 20) | p 
1. emotional dependence, opp. sex         | Clubhouse | 6                         | 14                       | .02 
2. emotional dependence, same sex         | Clubhouse | 6                         | 11                       | .08 
3. social interest, opp. sex              | Clubhouse | 7                         | 13                       | .04 
4. "talking" interest, opp. sex           | Clubhouse | 7                         | 13                       | .04 
5. predominant heterosexual orientation   | Clubhouse | 6                         | 11                       | .08 
6. attention-seeking, same sex            | Clubhouse | 8                         | 10                       | ns 
7. attention-seeking, opp. sex            | Clubhouse | 6                         | 13                       | .04 
8. deliberative                           | Clubhouse | 11                        | 10                       | ns 
9. cooperativeness                        | Clubhouse | 11                        | 14                       | ns 
10. resolute                              | Clubhouse | 11                        | 11                       | ns 
11. responsible                           | Clubhouse | 7                         | 10                       | ns 
12. exploitiveness                        | Clubhouse | 8                         | 11                       | ns 
13. leadership                            | Clubhouse | 8                         | 9                        | ns 
14. self-confidence with opp. sex         | Clubhouse | 8                         | 10                       | ns 
15. self-confidence with same sex         | Clubhouse | 9                         | 10                       | ns 
16. masculine behavior                    | Institute | 12                        | 4                        | .01 
17. expressiveness                        | Institute | 11                        | 8                        | ns 
18. sociability                           | Institute | 12                        | 11                       | ns 
19. submissiveness                        | Institute | 12                        | 12                       | ns 
20. confidence (poise)                    | Institute | 11                        | 13                       | ns 
21. assuredness                           | Institute | 8                         | 8                        | ns 
22. matter-of-fact                        | Institute | 8                         | 11                       | ns 
23. unaffected                            | Institute | 11                        | 9                        | ns 
24. obvious confidence (physical)         | Institute | 10                        | 8                        | ns 
25. masculine physique                    | Institute | 14                        | 9                        | .10 
26. good musculature                      | Institute | 14                        | 8                        | .05 

As this table indicates, there were sig- 
nificant differences, or marked trends to- 
ward significant differences, between the 
high and low masculinity groups in 6 of 
the 15 clubhouse ratings (Items 1 to 15 of 
Table 3) that dealt with variables cor- 
responding to, or congruent with, instru- 
mental or expressive traits. Compared with 
the group that had highly sex-typed mas- 
culine interests, more boys with relatively 
feminine interests were rated high, as had 
been predicted, in several emotional-expres- 
sive characteristics, particularly those in- 
dicative of dependency, friendliness, and 
sociability. More specifically, they were 
perceived by trained observers as more 
emotionally dependent on both boys and 
girls (Items 1 and 2 of Table 3) and more 
interested in social activity and talking with 
girls (Items 3 and 4 of Table 3). More- 
over, they were rated as striving harder for 
attention (i.e., seeking responsiveness and 
approval) from girls (Item 7). Those low 
in masculinity were also rated as more 
heterosexually oriented than their peers 
with more masculine interests (Item 5). In 
view of their strong dependency and ap- 
proval needs, however, it may be inferred 
that their interest in girls probably reflected 
their general sociability and gregariousness 
rather than any marked ability to establish 
mature relationships with members of the 
opposite sex. It is also possible that the rela- 
tively high heterosexual interests of those 
low in masculinity of interests resulted from 
the fact that they were more likely to find 
social success and social gratifications in 
interactions with girls than in interpersonal 
relationships with boys. 

Among the institute scales, nine, listed in 
Table 3 (Items 16 to 24), were judged to 
have relevance for testing the second hy- 
pothesis. The two groups were significantly 
differentiated on one very important rating, 
masculine behavior (Item 16), which pre- 
sumably expressed the judges' assessment 
of the overall masculine qualities of the 
subject. None of the other institute scales 
relevant to the second hypothesis signifi- 
cantly differentiated the two groups. 
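
The exact probabilities reported for the rating scales are described as the probability of the observed split "or all other possible, more extreme sets." A minimal sketch of that one-sided exact calculation for a fourfold table, with all marginal totals held fixed, is given below. The function name is hypothetical, and the sketch is an illustration of the general method rather than a reconstruction of the authors' computations, so it need not reproduce the rounded values printed in the tables. The usage lines take the masculine behavior counts from Table 3 (Item 16: 12 of 19 highs, 4 of 20 lows).

```python
from math import comb


def fisher_exact_one_sided(a: int, b: int, c: int, d: int) -> float:
    """Upper-tail exact probability for the fourfold table [[a, b], [c, d]].

    Returns the probability, with row and column totals fixed, of a
    top-left count of `a` or larger (the observed table plus every more
    extreme table in the same direction).
    """
    row1 = a + b              # size of the first group
    col1 = a + c              # total number rated "high"
    n = a + b + c + d
    max_a = min(row1, col1)   # largest top-left count the margins allow
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, max_a + 1)
    )
    return tail / comb(n, row1)


# Masculine behavior (Table 3, Item 16): highs 12 high / 7 low, lows 4 high / 16 low.
p = fisher_exact_one_sided(12, 7, 4, 16)
print(f"{p:.4f}")
```

Integer arithmetic with `comb` keeps the tail sum exact until the final division, which avoids the rounding problems that arise when factorials are multiplied in floating point.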

It may be concluded that, in general, the 
clubhouse rating data appeared to be sup- 
portive of Hypothesis 2, but the evidence 
from the institute ratings, while consistent, 
was not impressive. This may be attributable 
in part to the nature of the behavior 
sampling on which these institute ratings 
were based. It will be recalled that this 
consisted of observations of the subjects' 
free-play behavior in small groups composed 
exclusively of boys. In such situations, 
overt reactions are likely to be more re- 
stricted in range and more masculine than 
they are in social settings involving mixed 
groups. It seems reasonable to assume that, 
when only boys are present, social cues 
for specifically masculine responses are 
likely to be more potent and distinctive, and 
rewards for such behavior—and punish- 
ments for sex-inappropriate reactions—are 
likely to be strong. Consequently, under 
these circumstances, less masculine boys are 
more likely to suppress tendencies toward 
feminine responses and more masculine be- 
haviors are likely to be manifested. In other 
words, it may be argued that the social 
settings in which the boys were observed 
at the institute tended to produce a kind of 
“leveling” of behavior, possibly obscuring 
important differences between the two 
groups. In spite of this, however, the boys 
with highly masculine interests impressed 
the observers as being more generally mas- 
culine in behavior than the other group. 

The two groups deviated significantly, or 
nearly significantly, from each other on two 
other institute ratings not directly relevant 
to the second hypothesis, but concerned 
with physical characteristics (Items 25 and 
26 of Table 3). High ratings in good mus- 
culature were significantly more common 
among the boys with highly masculine inter- 
ests and there was a trend toward more 
high ratings in masculine physique in this 
group. These findings may suggest that the 
possession of masculine physical character- 
istics is conducive to the development of 
masculine interests and behavior. In this 
connection it should be noted, however, that 
the two groups did not differ in rate of 
physical maturing (i.e., they included equal 
numbers of late and early maturers, repre- 
senting the extremes of the distribution of 
rate of physical maturing) or in ratings of 
androgyny, masculinity or femininity of body 
form (Bayley, 1951). This last result is 
similar to Bayley’s (1951) finding that 
“masculine-feminine variables in physique 
and interests . . . are largely independent 
of each other" (p. 59). 

While the clubhouse and institute ratings 
represented the evaluations of the subjects' 
personality by trained adults, the Reputation 
Test results reflected their peers’ assess- 
ments of their characteristics. Seven of the 
Reputation Test scores, listed in Table 4, 
were considered to have bearing on the 
second hypothesis. 

TABLE 4 
NUMBER OF SUBJECTS IN THE TWO GROUPS WITH 
HIGH SCORES IN REPUTATION TEST VARIABLES 
RELEVANT TO HYPOTHESIS 2 

Variable             | Highs with high scores (N = 15) | Lows with high scores (N = 17) | p 
1. Talkative         | 7                               | 12                             | .09 
2. Active            | 7                               | 12                             | .09 
3. Humorous          | 6                               | 12                             | .04 
4. Friendly          | 7                               | 12                             | .09 
5. Attention-getting | 7                               | 11                             | .16 
6. Daring            | 7                               | 6                              | ns 
7. Humor about self  | 7                               | 7                              | ns 

As the table shows, the two groups 
differed significantly, or nearly significantly, 
in five of these. Compared with boys high 
in masculinity of interests, those with rela- 
tively feminine interests were more fre- 
quently nominated in response to sociometric 
questions focused on expressive traits; i.e., 
they received more votes in the categories talka- 
tive (Item 1), socially active (Item 2), and 
humorous (Item 3). In addition, the 
sociometric data suggest a slight tendency 
for those low in masculinity to be perceived 
as more dependent (i.e., more attention- 
getting and approval-seeking; Item 5) 
than their peers with highly masculine 
interests. 

It is interesting to note that, although 
boys with relatively feminine interests were 
not on the average later than the other 
group in physical maturing, they appear to 
have personal qualities similar to those 
found to be characteristic of late maturing 
boys (high ratings in social activity, talka- 
tiveness, attention-seeking, etc.; Jones & 
Bayley, 1950). From a sociological point 
of view, these characteristics are regarded 
as components of personality structure 
appropriate to the female, rather than the 
male, role. When they occur in adolescent 
boys, however, these qualities may be signs 
of immaturity and compensation for feel- 
ings of inadequacy (Jones & Bayley, 1950). 

The final set of data pertaining to the 
second hypothesis consisted of the drive 
ratings which, it will be recalled, sum- 
marized the intuitive assessments of the 
subject's underlying motivations, made by 
a number of judges who were well- 
acquainted with the subjects. Four of the 
drives rated referred to aspects of the 
instrumental or male role: need Autonomy 
(corresponding to the instrumental trait 
independence); need Achievement, cor- 
responding to ambition; need Aggression; 
and need Abasement (a low score pre- 
sumably corresponding to high self-confi- 
dence). Three other drives were associated 
with the feminine or expressive role: Social 
ties (corresponding to friendliness); Recog- 
nition (responsiveness to sympathy and 
approval); and Succorance (corresponding 
to dependency). Table 5 gives the number 
of subjects in each group who received high 
ratings in each of these seven drives and 
the exact probabilities of this distribution 
of high and low scores in the two groups. 


TABLE 5 
NUMBER OF SUBJECTS IN THE TWO GROUPS WITH 
HIGH RATINGS IN DRIVES RELATED TO 
HYPOTHESIS 2 

Drive          | Highs with high scores (N = 18) | Lows with high scores (N = 19) | p 
1. Recognition | 5                               | 14                             | .01 
2. Social ties | 8                               | 13                             | .07 
3. Succorance  | 7                               | 12                             | .07 
4. Autonomy    | 7                               | 11                             | ns 
5. Achievement | 7                               | 11                             | ns 
6. Abasement   | 10                              | 10                             | ns 
7. Aggression  | 10                              | 11                             | ns 


The data also provide some limited sup- 
port for the hypothesis. Thus, adolescent 
boys low in masculinity of interests were 
more frequently judged to reveal motiva- 
tional patterns generally attributed to the 
feminine, expressive role. In specific terms, 
more of this group than of those high in 
masculinity of interests were rated high in 
drives for social ties, for recognition, and 
for succorance (dependency). As in the 
case of the TAT data, however, there was 
no evidence that male sex-typed motives, 
such as independence, aggression, and 
achievement were more characteristic of 
the group high in masculinity. It may be 
concluded that as far as these data are 
concerned, masculine sex-typing of interests 
among adolescent boys is not necessarily 
accompanied by sex-typing of underlying 
drives, although absence of a high degree 
of sex-typing of interests in boys tends to 
be more closely associated with typically 
feminine motivations. 


Hypothesis 3 


If Hypothesis 3 is valid, boys who have 
identified strongly with the masculine role— 
i.e., have acquired strongly masculine inter- 
ests—will give evidence of better general 
adjustment and greater emotional stability 
than boys whose interest patterns are more 
feminine. As we noted earlier, some of 
the characteristics which were regarded as 
primarily concerned with the instrumental- 
expressive polarity, and thus particularly 
relevant to Hypothesis 2, might also be 
interpreted as manifestations of good or 
poor adjustment, or of psychological ma- 
turity or immaturity and thus as pertinent 
to the third hypothesis. For example, it 
might plausibly be assumed that strong 
dependency needs and striving for attention 
and approval in adolescent boys are indi- 
cations of maladjustment, while self-confi- 
dence and independence may be signs of 
maturity and personal adequacy. As noted 
earlier, some investigators interpret high 
degrees of expressive traits among adoles- 
cent boys as reflections of immaturity and 
basic feelings of inadequacy (Jones & Bay- 
ley, 1950). Granting these assumptions and 
the tenability of these interpretations, some 
of the findings reviewed above—e.g., those 
concerning the highly masculine boys’ rela- 
tively greater independence and self-confi- 
dence, as well as their lesser tendencies 
toward attention-seeking, talkativeness, and 
sociability—may be considered evidence in 
support of the third hypothesis. More direct 
tests of the hypothesis were made, however, 
employing a number of general adjustment 
variables drawn from the University of 
California Adjustment Inventory (UCAI), 
the clubhouse and institute ratings, and the 
Reputation Test (one variable). 

Analysis of the scores on the UCAI 
showed that the high and low masculinity 
groups did not differ significantly from each 
other in any of the eight subtests, each 
of which was designed to assess a different 
area of adjustment. However, in six of 
them (family, personal inferiority, physical 
symptoms, fears, general tension, overstate- 
ment), the proportion of boys with highly 
masculine interests achieving high (better 
adjustment) scores was greater than the 
proportion in the other group. The differ- 
ence between the two groups in total score, 
a measure of overall adjustment made up 
of all the subtests, approached the usual 
criterion of statistical significance (p < .08). 
It may, therefore, be tentatively concluded 
that data from this inventory are generally 
consistent with, and thus support, the third 
hypothesis. 

Much more substantial evidence in favor 
of this hypothesis was found in the group 
comparisons in several clubhouse and insti- 
tute scales deemed to be measures of per- 
sonal adjustment or maladjustment. These 
are listed in Table 6. Among the clubhouse 
scales, there were 10 ratings concerned with 
feelings of happiness and well-being, free- 
dom from tension and conflict (Items 1 to 
10 of Table 6). 

TABLE 6 
NUMBER OF SUBJECTS IN THE TWO GROUPS RATED HIGH IN CLUBHOUSE AND INSTITUTE 
SCALES RELEVANT TO HYPOTHESIS 3 

Rating Dimension                | Scale     | Highs rated high (N = 19) | Lows rated high (N = 20) | p 
1. Carefree                     | Clubhouse | 13                        | 6                        | .02 
2. Content                      | Clubhouse | 12                        | 7                        | .04 
3. Relaxed                      | Clubhouse | 12                        | 8                        | .08 
4. Exuberance                   | Clubhouse | 11                        | 7                        | .08 
5. Happy                        | Clubhouse | 11                        | 7                        | .08 
6. Calm                         | Clubhouse | 12                        | 8                        | .08 
7. Smooth in social functioning | Clubhouse | 12                        | 8                        | .08 
8. Unselfishness                | Clubhouse | 9                         | 11                       | ns 
9. Constancy of mood            | Clubhouse | 10                        | 9                        | ns 
10. Well-adjusted               | Clubhouse | 11                        | 9                        | ns 
11. Carefree                    | Institute | 12                        | 4                        | .01 
12. Relaxed                     | Institute | 14                        | 8                        | .05 
13. Good-natured                | Institute | 13                        | 12                       | ns 
14. Popularity                  | Institute | 11                        | 9                        | ns 
15. Cheerful                    | Institute | 10                        | 12                       | ns 

Analysis of these data indicated that the 
two groups of subjects differed significantly, 
or tended to differ, in 8 of the 10 pertinent 
variables. Compared with the other group, 
a greater proportion of those with highly 
sex-typed masculine interests was better 
adjusted in the sense that they were more 
carefree, more contented, more relaxed, 
more exuberant, happier, calmer, and 
smoother in social functioning. Since all 
these differences were in the direction pre- 
dicted on the basis of the third hypothesis, 
the findings constitute strong support for 
this hypothesis. 

Further, although less impressive, con- 
firmation of the validity of this hypothesis 
was found in the institute ratings, five of 
which (Items 11 to 15 of Table 6) dealt 
with personal adjustment. These data rein- 
forced those based on the clubhouse ratings 
summarized above in showing that signifi- 
cantly more of the highly masculine sub- 
jects were rated high by staff observers 
in the characteristics “relaxed” and “care- 
free.” These group differences in club- 
house and institute ratings, considered 
individually, all seem to substantiate the 
third hypothesis: i.e., they support the 
prediction that among adolescent boys, 
strong identification with the male role is 
associated with manifestations of emotional 
stability. 

As noted earlier, the institute and club- 
house ratings also constituted the basic 
data for Else Frenkel-Brunswik's (1942) 
analysis of more general genotypic “total 
Personality structure” made in connection 
with her study of motivation and behavior. 
_ Опе aspect of her study bears directly on, 
and further supports, the third hypothesis 
_ of the present study. From an examination 
of all male subjects’ sigma score profiles 
derived from 11 behavior rating scales, 
Frenkel-Brunswik located four small groups 
representing different combinations of ade- 
quate and inadequate social and emotional 
adjustment. She identified them as: (a)
socially successful and emotionally well-adjusted
(12 cases), (b) socially successful
but emotionally maladjusted (8 cases),
(c) socially unsuccessful and emotionally 
well-adjusted (8 cases), (d) socially un- 
successful and emotionally maladjusted (7 
cases). 

Relatively few of the subjects in these 
four groups were also included in the pres- 
ent study, but six of the eight “socially 
successful but emotionally maladjusted” 
cases were. It is interesting to note that 
five of these six were in the low masculinity 
group, and only one was in the high mas- 
culinity group. The number of cases in- 
volved is, of course, too small to permit 
broad generalization. However, this result 
seems to integrate and further substantiate 
the findings derived from the individual 
rating variables. More specifically, as the 
group differences in the particular institute 
and clubhouse ratings suggest, boys low in 
masculinity appear to achieve, at least 
superficially, a high level of social adjust- 
ment—probably as a result of their social 
initiative and friendliness—but this is likely 
to be accompanied by emotional maladjust- 
ment. In summary, according to global 
assessments of personal adjustment based 
on profiles of several aspects of personality 
(total personality structure), more of the 
subjects in the low masculinity group were 
emotionally maladjusted during adolescence. 
This finding seems entirely congruent with 
what would be expected if the third hy- 
pothesis were valid. 

Among the variables derived from the 
Reputation Test, only one was considered 
pertinent to the third hypothesis. The 
frequency with which peers describe the 
subject as restless (i.e., his restlessness
score on the Reputation Test) probably 
reflects the degree of his overt expression 
of tension and nervousness. The two groups 
differed significantly, and in the direction 
predicted on the basis of Hypothesis 3, 
in the number of nominations received in 
response to the inquiry “Who seems rest- 
less?” Sixty-five percent of the group low 
in masculinity, but only 33% of the highly 
masculine group, were selected with high 
frequency (more than the median number 
of times) by their peers as fitting this 
description.

To summarize, considering all the relevant
data from all the sources, the results
substantially support the third hypothesis. 
It may be concluded that, as had been pre- 
dicted on the basis of this hypothesis, a 
high degree of masculine identification dur- 
ing adolescence tends to be associated with 
personal adequacy and emotional stability. 


Hypothesis 4 


This hypothesis was concerned with the 
long-range consequents of different degrees 
of masculine identification during adoles- 
cence. It was predicted that adolescent boys 
with highly masculine interests would 
achieve better personal adjustment and 
greater emotional security during adulthood 
than their peers whose adolescent interests 
were relatively feminine. 

At the time the present study began, little 
was known about the later personality and 
adjustment of the Adolescent Growth Study 
subjects. Adult ratings and tests analogous 
to the clubhouse and institute ratings or 
the UCAI were not available. Fortunately, 
however, as part of Tuddenham’s study 
(1959) of “yielding” behavior, the Cali- 
fornia Psychological Inventory and the Ed- 
wards Personal Preference Schedule were 
administered to a substantial number of 
the subjects when they had reached their 
early thirties, i.e., about 16 years after high 
school graduation. The responses of the 
two groups of subjects to these tests yielded 
some information about their adult moti- 
vational structures and personality charac- 
teristics, and, by inference, their general 
adjustments. Thus it was possible to make 
some limited tests of Hypothesis 4. 

Table 7 lists the subtests of the California 
Psychological Inventory and the number of 
subjects in the two groups with scores 
above the median for all the subjects in 
each of the scales. The picture of the 
adult statuses and personality structures of 
those who had been high and low in mas- 
culinity during adolescence that emerges 
from the group comparisons is indeed a 
complex one. 

Judging from their responses to this
personality questionnaire, the highly masculine
group tended to have more masculine
attitudes, values, and reactions as adults;
i.e., more of them than of the group that
had been low in masculinity during adolescence
scored low (less feminine) in a scale
which purports to assess femininity of interests
and attitudes. In spite of the vast
differences in the form and content of the
adolescent and adult indices of masculinity
(respectively, the Strong Vocational Interest
Blank, focused primarily on occupation
preferences, and the inverse of CPI
femininity, a score derived from the subject's
responses to 38 attitude and belief
questions), the two measures tend to be
associated (p < .08). Since there is substantial
evidence on the validity of each of
these indices, it may be inferred that masculinity
of interests and attitudes is a
relatively enduring quality.

TABLE 7

NUMBER OF SUBJECTS IN THE TWO GROUPS WITH
HIGH SCORES IN CALIFORNIA PSYCHOLOGICAL
INVENTORY SUBTESTS

Scale                         Highs with     Lows with      p
                              high scores    high scores
                              (N = 15)       (N = 12)
Femininity                         6              8         .08
Ego control                        9              4         .08
Dominance                          5              9         .04
Capacity for status                3              8         .02
Self-acceptance                    4              8         .05
Sociability                        7              7         ns
Social presence                    4              8         .05
Responsibility                     7              8         ns
Self-control                       6              7         ns
Good impression                    7              1         ns
Achievement via conformance        6              7         ns
Achievement via independence       8              7         ns
Flexibility                        6              6         ns

Some indirect support for Hypothesis
4 may be found in the fact that when they
became adults, more of the subjects who
had been high in masculinity of interests
during adolescence scored high on the CPI
Ego Control scale, developed by Block.
According to its author, this scale “reflects
the impulse-control capacity of the individual,
whether he delays gratifications and
binds tensions excessively, or whether he
tends to allow his needs immediate (too
immediate) expression.”*
Relatively high scores in this scale may
be interpreted as evidence of a high degree
of self-control, a characteristic ascribable
to the instrumental or task role. It
may therefore be inferred that a relatively
greater proportion of those who were highly
identified with the male role during adolescence
possessed at least this one major
component of that role as adults. Insofar
as scores above the median in ego-control 
reflect greater degrees of emotional stability 
and personal integration, it appears that the 
highly masculine adolescents tended to be- 
come better adjusted adults. This is fully 
in accord with the prediction generated by 
Hypothesis 4.

On the other hand, as Table 7 shows,
adult men who, as adolescents, had highly 
masculine interests scored lower than the 
other group in three CPI scales assessing 
attributes that appear to correspond to 
aspects of the instrumental role. The first 
of these, Dominance (Do), purports “to 
assess factors of leadership ability, domi- 
nance, persistence, and social initiative" 
(Gough, 1957, p. 12). The Capacity for 
Status (CS) scale attempts to measure the 
personal qualities and attributes which un- 
derlie and lead to high status, not actual 
or achieved status (Gough, 1957). The 
third scale, Self-Acceptance (SA), assesses 
"factors such as sense of personal worth, 
self-acceptance, and capacity for independ- 
ent thinking and action" (p. 12). All three 
of these scales form part of the cluster
of "measures of poise, ascendancy, and
self-assurance." 
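
The three group differences just described can be rechecked from the counts in Table 7. The sketch below is only an illustration and not the author's procedure: this section does not say which significance test produced the tabled p values, so the one-tailed Fisher exact test used here is an assumption, and the function and variable names are hypothetical.

```python
from math import comb


def fisher_one_tailed(high_hits, high_n, low_hits, low_n):
    """One-tailed Fisher exact p for a 2 x 2 table.

    Returns the probability, with all margins fixed, that the
    high-masculinity group contains `high_hits` or fewer of the
    subjects who scored above the median.
    """
    total_n = high_n + low_n
    total_hits = high_hits + low_hits
    denom = comb(total_n, total_hits)
    # Smallest count the high group can hold given the margins.
    k_min = max(0, total_hits - low_n)
    p = 0.0
    for k in range(k_min, high_hits + 1):
        p += comb(high_n, k) * comb(low_n, total_hits - k) / denom
    return p


# Counts copied from Table 7: (Highs with high scores, Lows with high scores).
# Group sizes are those given in the table header: Highs N = 15, Lows N = 12.
TABLE_7 = {
    "Dominance": (5, 9),
    "Capacity for status": (3, 8),
    "Self-acceptance": (4, 8),
}

results = {
    scale: fisher_one_tailed(highs, 15, lows, 12)
    for scale, (highs, lows) in TABLE_7.items()
}

for scale, p in results.items():
    print(f"{scale}: one-tailed p = {p:.3f}")
```

Under this assumption the computed values fall close to the tabled .04, .02, and .05. That agreement does not establish which test the author actually used.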

Clearly this evidence cannot be regarded
as confirming the fourth hypothesis. In
fact, the association between relatively low
scores on these three scales and masculinity
of vocational interests during adolescence
appears to be directly contradictory to what
would be expected if this hypothesis were
valid. These results might therefore be
considered sufficient grounds for refuting
the hypothesis.

* Block, J. Personal communication, 1959.

On the basis of a review of some studies 
of the correlates of these scales, however, 
it could reasonably be argued that these 
scales do not in themselves provide ade- 
quate data for testing the hypothesis. 
Gough (1957), the author of the CPI, 
found that men who score high in Do were 
viewed by peers as "aggressive, confident, 
persistent, and planful, as being persuasive 
and verbally fluent; as self-reliant and 
independent; and as having leadership po- 
tential and initiative" (p. 12). High scorers 
in CS were perceived as "ambitious, active, 
forceful, insightful, resourceful, and versatile;
as being ascendant and self-seeking;
effective in communication; and as having 
personal scope and breadth of interests" 
(p. 12). Those high in the SA scale were 
described as “intelligent and outspoken, 
sharp witted, demanding, aggressive, and 
self-centered; as being persuasive and 
verbally fluent; and as possessing self- 
confidence and self-assurance" (p. 12). 
These descriptive phrases, attributed to high 
scorers in these subtests, strongly suggest 
that these scales do not measure only task 
or instrumental qualities. In addition, they 
appear to be heavily weighted with com- 
ponents of the expressive role: e.g., social 
poise, ease and skill in handling new inter- 
personal situations, and verbal fluency and 
effectiveness. Thus, instrumental and ex- 
pressive qualities seem to be confounded 
in these scales and this circumstance seri- 
ously limits the usefulness of the derived 
indices for the evaluation of sex-typing 
of personal characteristics during adult- 
hood. From the point of view of the socio- 
logical analysis of sex roles, high scores 
in these scales may be regarded as repre- 
senting some combination of highly devel- 
oped qualities congruent with both task and 
emotional orientations. 

If this is true, the fact that the subjects 
who had highly masculine interests during 
adolescence scored low, as adults, in the 
CPI Do, CS, and SA scales may be due, 
at least in part, to the constancy of certain 
personality characteristics. More specifically, 
these low adult scores may be a demonstration
of the durability of the highly masculine
boys' low status in emotional-expressive
characteristics (e.g., social initiative, sociability,
and friendliness), noted during the
adolescent period (see results of tests of 
Hypothesis 2, above). In short, the findings 
from the CPI scales do not necessarily 
indicate that the strength of the instru- 
mental traits of the highly masculine sub- 
jects diminished or that they became more 
maladjusted when they became adults.

Unfortunately, the results of the Edwards
Personal Preference Schedule, which was
given at about the same time as the CPI,
neither clarified the meaning of the CPI
findings nor helped extend the understanding
of the possible enduring effects of
different degrees of appropriacy of masculine
sex-typing during adolescence. As
adults, the two groups deviated significantly
from each other in only one of the 15
needs assessed by this instrument. Those
whose interests had been highly masculine
during adolescence scored higher in need
Abasement. This need was defined as follows
by Edwards (1954):

    to feel guilty when one does something wrong, to
    accept blame when things do not go right, to feel
    that personal pain and misery suffered does more
    good than harm, to feel the need for punishment
    for wrongdoing, to feel better when giving in and
    avoiding a fight than when having one's own way,
    to feel the need for confession of errors, to feel
    depressed by inability to handle situations, to feel
    timid in the presence of superiors, to feel inferior
    to others in most respects (p. 5).

If a high score in this variable reflects
strong feelings of inadequacy or inferiority, 
the present result might mean that men who 
had had highly masculine interests during 
adolescence became more poorly adjusted 
adults than those whose adolescent interests 
were relatively feminine. Interpreted in 
this way, this finding may be seen as con- 
sonant with the interpretation that the CPI 
results cited above do, in fact, suggest that 
the highly masculine adolescents became less 
self-accepting (i.e., less secure) adults.
However, an alternative explanation of high
scores in need Abasement is also possible;
i.e., such scores may be interpreted as indications
of ability and willingness to face
and admit weaknesses and shortcomings
objectively. Such ability probably depends
on the possession of some basic feelings of
security and relative freedom from conflict,
for severely maladjusted or unstable individuals
are likely to be afraid to face their
personal problems directly. Some indirect
evidence supporting this reasoning about
the meaning of the need Abasement scores
may be found in a study in which college
students from homes described as “very
cool-strong discipline,” which presumably
would not foster basic feelings of security
or substantial identification with parents,
scored relatively low in this need, as the
subjects of the present study with relatively
feminine interests did (Schutz, 1958).

Unfortunately, the available data are not
adequate to permit the evaluation of these
vastly different, sometimes contradictory,
interpretations of the Edwards and CPI
scores that differentiated the two groups
of adults. Granting certain assumptions,
some of the findings may be judged to be
consistent with, and supportive of, the
fourth hypothesis. If these assumptions are
not valid, however, the results offer no evidence
in favor of the hypothesis. In brief,
it must be concluded that, on the basis of
the present data, Hypothesis 4 cannot be
either confirmed or refuted.


Discussion 


The assumption underlying this study is 
that identification is essentially a secondary 
drive, producing behavior in the child which 
replicates his parents'—and particularly the 
like-sex parent's—behavior (Sears, 1957). 
Factors affecting the strength of this drive 
and possible consequents of varying degrees 
of drive strength were the foci of the 
investigation. 

A word about the measure of identification
employed is in order before discussion
of the results themselves. Since identification
drive strength cannot be assessed directly,
it must be inferred from its presumed
consequents. There are, of course, many
different, and not necessarily closely related,
consequents of strong identification with
the like-sex parent. Hence selection of
indices or criteria of identification must be,
to a large degree, arbitrary. It was recognized
from the outset that the single
criterion used in the present study— 
masculinity of interests in adolescent boys 
—might be a highly restricted, fallible, or 
inadequate one for several reasons. For 
one thing, the primary concern was with 
sex-typing of behavior and personality or 
what might be labeled "psychological mas- 
culinity," conceived as a general character- 
istic; yet there was no available evidence 
that masculinity of interests was positively 
related to other aspects of masculine sex- 
typing. If the MF score proved to be 
statistically independent of other indices of 
psychological masculinity, it would be 
judged to have little or no value as a 
measure of sex-typing or sex-role identifi- 
cation. Fortunately, however, the data of 
the present study—especially those pertain- 
ing to Hypothesis 2—strongly suggest that 
this was not the case. That is, masculinity 
of interests, as evaluated here, was in fact 
significantly associated with other measures 
(tests and ratings) of masculine personality 
traits and behavior. In this sense, it may 
be regarded as an acceptable criterion of 
masculine sex-typing or male role identifi- 
cation. (This point will be discussed at 
greater length below.) 

The Strong Vocational Interest Blank 
MF score may be inadequate as a criterion 
of masculine identification in another sense, 
however. The extent of the individual’s 
masculinity of interests may be strongly 
affected by factors that are theoretically 
irrelevant to the process of identification.
For example, there is at least a suggestion
in the data of the present study that possession
of a typically “masculine physique”
and “good musculature” is associated with
high masculinity of interests. From this
it might be inferred that the acquisition of
sex-typed interests may be facilitated by
certain simple physical (presumably constitutional)
factors such as masculinity of
appearance, a variable which, at least in
theory, would be independent of the
strength of the identification drive.

It is also possible that, in some cases,
highly developed masculine interests are
not at all indicative of deep-seated identification
with the male role. On the contrary,
these interests may be compensatory or de- 
fensive reactions to a sense of deficiency 
in the development of either masculine 
physique, or sex-appropriate personality 
characteristics, or overt behavior. Certainly 
in such cases masculinity of interests would 
be a fallible index of strength of identifi- 
cation. This kind of fallibility in the cri- 
terion measure might reasonably be expected 
to have the effect of attenuating the "true" 
relationship between psychological mas- 
culinity or sex-typing of behavior, on the 
one hand, and variables relevant to the 
identification process, on the other. If the 
designations of high and low masculinity 
groups had been based on several criteria 
(e.g., masculinity of interests together with 
high ratings in male characteristics and 
masculine physique) rather than on a single 
one, some of the predicted relationships 
might have been strengthened. 

Nevertheless, in spite of the fact that 
the relationships obtained were probably 
minimal ones, the number of statistically 
significant associations between the masculinity
measure and variables related to
identification with the father was impres- 
sive. There was, for example, substantial 
support for the first hypothesis dealing with 
the familial determinants of identification. 
Specifically, it was postulated that the
strength of the boy's masculine identification 
would vary with the extent to which he 
regards his interactions with his father as 
nurturant and rewarding (essentially the 
developmental identification hypothesis). 
This confirmation of the hypothesis further 
corroborates the results of several other 
studies that indicate that favorable attitudes 
toward, or treatment by, the father are re- 
lated to other operational indices of strength 
of identification. Thus, it has been shown 
that five-year-old boys who are accepted 
by their fathers tend to manifest high levels 
of conscience development, a major outcome 
of the identification process (Sears, Mac- 
coby, & Levin, 1957), and those strongly 
identified with their fathers tend to display
more sex-typed aggressive behavior (Levin 
& Sears, 1956). In the same age group, 
another key attribute of sex-role behavior,
the adoption of masculine interests, appears
to be related to both the boys' perceptions 
of their fathers as warm and nurturant 
(Mussen & Distler, 1959) and to actual 
affectionate treatment by their fathers, as 
well as to high levels of superego develop- 
ment (Mussen & Distler, 1960). Marked 
resemblances between fathers' and their 
adolescent sons' responses to attitude and 
belief statements taken from the CPI are 
related to boys' favorable perceptions of 
fathers and to greater masculinity as meas- 
ured by the CPI Femininity scale (Payne & 
Mussen, 1956). In sum, the findings related 
to the first hypothesis of the present study 
appear to be entirely consonant with those 
of other studies that lend support to the 
developmental identification hypothesis. The 
absence of significant relationships between 
variables concerned with perception of the 
mother and degree of masculinity of inter- 
ests and attitudes is also consistent with 
findings of other studies (Mussen & Distler,
1959). Under ordinary circumstances these 
masculine qualities must be transmitted by 
the father rather than by the mother; 
hence they are more directly affected by the 
nature of the father-son relationships than 
they are by mother-son interactions. 

The major significance of this reaffirma- 
tion of results of earlier research lies in its 
extension of the range of generalizability 
or "ecological validity" (Brunswik, 1947) 
of the developmental identification hypothe- 
sis. Considered together, the data analyses 
of this and other studies demonstrate that 
this hypothesis applies at several age levels 
(during adolescence as well as in earlier 
childhood) and when strikingly different 
measures of identification—e.g., doll-play 
aggression or masculinity measured by the 
IT Scale or the Strong Vocational Interest 
Blank—are used. 

It should be emphasized, however, that 
these results bear directly only on later or 
continued identification. The data provide 
little information about the genesis or 
earliest determinants of identification. It is 
quite possible, for example, that the initia- 
tion of the process depends on factors that 
are quite different from the ones involved 
in maintaining it afterwards. The finding
that the boy's later identification with his
father is founded on positive, affectionate 
relations with him, as the developmental 
hypothesis maintains, seems reconcilable, at 
least in theory, with the notion that this 
identification may originate in other ways. 
Thus it is possible that the boy's fears and 
anxieties related to his hostility toward his 
father may be the most crucial variables 
underlying the boy's first masculine identifi- 
cation, as the psychoanalytic or defensive 
theory of identification maintains. Or the 
earliest identification with the father may 
be based on frequent and intensive inter- 
actions with him, especially if he is regarded 
as powerful, as role theory holds. Some 
evidence favorable to all three hypotheses— 
the developmental, defensive, and role 
theory—was found in the author's investi- 
gation of the familial antecedents of mas- 
culinity in five-year-old boys, a group 
presumably much closer to the beginnings 
of their father-identifications. On the basis 
of the data of that study it was concluded 
that those who formed substantial male 
identifications by the age of five “view their 
fathers as powerful sources of both reward 
and punishment." This is “in accordance 
with the role theory, which maintains that 
the child is most likely to assimilate the 
role of an individual with whom he has 
intensive interactions, especially if this in- 
dividual is powerful" (Mussen & Distler, 
1959, p. 356). 

Unlike these data on younger boys, the 
results of the present study offered no 
support for the defensive identification 
hypothesis. The evidence may be inter- 
preted as supportive of role theory con- 
ceptions of identification, however, if the 
tenable assumption is made that perceptions 
of the father as highly affectionate imply 
that the child experiences more frequent 
and more intense interactions with him. 

This suggests an hypothesis about the 
maintenance or continuation of the boy's 
identification with his father, probably lead- 
ing to progressively stronger and more in- 
clusive adoptions of behavior appropriate 
to the male role. This hypothesis postulates 
that, regardless of how the identification 
mechanism is generated, substantial later
sex-role identification is dependent upon
warm, affectionate relationships between
father and son. Conceptualized in terms of
general behavior theory, it is held that the
boy's early imitation of his father's be- 
havior is likely to elicit more affection from 
the father: i.e., this behavior is apt to be 
reinforced relatively frequently and con- 
sistently. Father-replicative responses ac- 
quire greater habit strength, and the boy 
imitates his father more, thus adopting 
more sex-appropriate role behavior. 

It should be noted that this conceptualiza- 
tion is not necessarily incompatible with the 
Freudian notions of the Oedipal origin of 
identification. According to psychoanalytic 
theory, the boy's relationships with his 
father are likely to become considerably less 
ambivalent and more positive after resolution
of the Oedipus complex: i.e., while
early identification is presumably based on 
hostile attitudes towards the father and 
accompanying anxiety, later identifications 
may be associated with changed, positive 
attitudes. There is no important contradic- 
tion between this conception of identifica- 
tion and the hypothesis proposed here. 
Unfortunately, however, the present data 
do not provide bases for confirmation or 
refutation of any hypotheses dealing with 
both the origin and development of the 
process of identification. 

Previous researches on the identification 
process and its correlates have typically em- 
ployed single operational measures as the 
criterion of strength of identification. Ex- 
amples include doll-play aggression, con- 
science development, parent-child similarity 
of response to a questionnaire, or an index 
of masculinity. The relationships between 
the criterion employed and other possible 
measures of identification have not generally 
been studied systematically, although some 
exceptions to this can be cited. For ex- 
ample, in an earlier study, the author 
showed that high masculinity among five- 
year-olds, assessed by means of semiprojec- 
tive tests, was associated with measures of 
conscience development (Mussen & Distler, 
1960), and Levin and Sears (1956), using 
an index of conscience development as their 
operational measure of strength of identification,
found that boys highly identified
with their fathers displayed more aggres- 
sion, a presumably masculine characteristic, 
in doll-play. 

Obviously, however, if sex-role behavior 
is a meaningful concept, it must refer not 
to any single dimension of behavior or 
restricted set of sex-typed attitudes, but 
to a unified, coherent complex of reactions, 
characteristics, attitudes, and values. In 
order to make any generalizations about 
the adoption of sex-appropriate behavior, 
there must be some evidence that the operational
measure employed is representative:
i.e., that the criterion is, in fact, significantly
related to other important aspects
of behavior comprising the sex role. In 
other words, if the process of identification 
is a major determinant of the acquisition 
of appropriate sex-typed behavior, its re- 
sults must be demonstrable in a pattern 
of interrelated behaviors and attitudes. 

The data of the present study help to 
clarify the range of behaviors—particularly 
those appropriate to the male sex role— 
acquired by the boy through identification 
with his father. The findings bearing on the 
second hypothesis, which was focused on 
the interrelationships among components of 
male sex role behavior, constituted at least 
partial confirmation of that hypothesis. It 
was demonstrated that adolescent boys with 
highly appropriate sex-typed interests 
possessed more than an average degree of 
self-confidence, an instrumental quality, 
and, according to adult observers, generally 
reacted in more masculine ways. Further 
indirect evidence that strong identification 
implies the adoption of a pattern of sex- 
role behaviors, rather than only a few 
isolated segments, is provided by the finding 
that expressive emotional traits, consonant 
with the female role—e.g., dependency, 
gregariousness, strong emphasis on socia- 
bility and recognition—are relatively lack- 
ing or poorly developed among the subjects 
with highly masculine interests. These latter 
qualities are more characteristic of the boys 
with relatively feminine interests, i.e., boys 
who were more highly identified with the 
female role.

These findings related to the second hypothesis
raise a number of interesting problems
and give rise to some further speculations
about the acquisition of appropriate
sex-role behavior and the extent to which 
this involves a coherent or integrated 
process. It is noteworthy that, contrary 
to the predictions generated by the second 
hypothesis, certain motivational and behavioral
characteristics generally regarded
as distinctive elements of masculine be- 
havior—e.g., high achievement and aggres- 
sion needs—were not found to be more 
typical of those with highly masculine inter- 
ests than of those low in masculinity of 
interests. Perhaps this was due to the 
methods used: i.e., the assessment tech- 
niques may not have been sufficiently sensi- 
tive to detect differences in these qualities, 
although in reality they existed. 

Other plausible explanatory hypotheses 
may also be proposed, however. Possibly 
certain characteristics are very clearly de- 
fined as aspects of the male role and, 
because of this, all boys are under powerful 
social pressure from social institutions and 
agencies—family, peer groups, school, and 
mass media—to behave in ways consonant 
with these cultural stereotypes of that role. 
The extent to which boys in late adoles- 
cence, after many years of being subjected 
to these pressures, will incorporate these 
obvious characteristics of masculinity, may 
be strongly influenced by many factors other 
than the strength of identification with the 
male role. Hence, the possible effects of 
different degrees of masculine identification 
on the development of such motives as 
aggression and achievement may be ob- 
scured. To cite an illustration of a kind 
frequently found in clinics and counseling 
centers, a boy who has not acquired strongly 
masculine interests or instrumental char- 
acteristics such as self-confidence may, in 
response to pressures from peers and the 
school, become highly aggressive or strive
extremely hard to accomplish socially ap- 
proved goals. Or, at the other extreme, a 
boy who is basically masculine in interests 
and attitudes may offer strong resistance to 
social pressures to increase his achievement 
motivation. 


The fact that highly masculine boys
tended to be relatively lacking in feminine
characteristics (or conversely, that boys
low in masculinity possess female traits to
a more marked degree) suggests another
hypothesis about the development of male
role behavior. It may be postulated that
this development depends not only on the
adoption of certain masculine overt reactions,
attitudes, and motivations but also,
concomitantly, on the elimination or extinction
of responses inappropriate to this sex
role or appropriate to the opposite one.
Infants of both sexes undoubtedly form
their earliest identifications with their
mothers and consequently, for a while at
least, they begin to adopt their mothers’
(i.e., feminine) behavior. After a few
years, boys presumably shift their identifications
to their fathers, although girls generally
continue to identify with their
mothers.

For the boys, this shift must necessarily
entail some reduction of strength of identi- 
fication with the mother, and, consequently, 
the weakening of the habit strength of 
responses duplicating her behavior. This 
would be manifested by the diminution or 
relinquishment of the characteristics and
reactions that constitute the female sex 
role. In brief, boys who were highly identi- 
fied with their fathers would be expected 
to display relatively little psychological 
femininity, while boys who failed to identify 
strongly with this parent would not emulate
his responses and would therefore reveal
more feminine characteristics. The data of
the present study seem consonant with this
reasoning. In terms of general behavior
theory, the adoption of male role behavior 
involves the acquisition and increased habit 
strength of male sex role characteristics 
together with the abandonment or extinction
—or weakened habit strength—of those 
reactions which are culturally defined as
feminine. Conceptualized in role theory 
terms, the development of sex-typed be- 
havior depends upon the combined activities 
of imitating the male role components and, 
to a significant degree, eliminating responses
modeled after the mother’s. Among adolescent
boys, relative femininity may be attributable
to failure to extinguish early-developed
female reactions or to failure to
acquire more masculine responses, or both. 

Analysis of the data relevant to Hypothesis 3,
which dealt with the subjects' general
adjustment during adolescence,
makes it clear that those who developed 
highly masculine interests surpassed their 
peers with relatively feminine interests in 
measures of emotional stability. As had 
been predicted on the basis of the hy- 
pothesis, the former group ranked higher 
than the latter in several test and rating 
variables indicative of emotional maturity 
and personal adequacy. Thus, the highly 
masculine boys scored higher than the other 
group on the total adjustment score of the 
UCAI which had been specially constructed 
for this research program. In addition, they 
were rated higher by adult observers in 
characteristics such as carefree, contented, 
relaxed, happy, calm, and “smoothness” in 
social functioning, all of which may be 
evidences of underlying feelings of security. 

The last finding on “smoothness” of so- 
cial functioning may also have some impli- 
cations for the interpretations of other 
group differences. For example, in view 
of this result, it hardly seems likely that 
the relatively low level of social orientation 
among the highly masculine adolescent 
boys reflects ineptitude or awkwardness in 
social relationships. Instead, their low
ratings in gregariousness and sociability 
may represent, as the hypothesis proposed 
above suggests, the extinction of responses 
generally considered to be aspects of female 
role behavior. Furthermore, the relatively 
low standing of the boys low in masculinity 
of interests in "smoothness in social func- 
tioning," considered in the light of their 
high levels of outgoingness and social 
initiative, may be interpreted to mean that 
their sociability is essentially of an im- 
mature sort, perhaps reflecting their atten- 
tion-getting and dependency needs rather 
than skill in the establishment of mature 
relations with others. 

While there is obviously a substantial 
association between strong masculine identi- 
fication and good emotional adjustment, 
it is not possible to specify from the data
which is antecedent and which is consequent.
That is, the correlations may be
interpreted to mean that achieving a high 
degree of identification with one’s own sex 
role is conducive to emotional stability or, 
alternatively, that personal adequacy facili- 
tates adoption of appropriate sex role 
behavior. As a third alternative, it is possi- 
ble that among adolescent boys, high 
masculine identification and good adjust- 
ment are associated with each other simply 
because both stem from the same source, 
namely, favorable father-son relationships.

The present data do not provide answers 
to many of the problems of the develop- 
ment and maintenance of an identification. 
However, we would hazard a guess—a 
tentative explanatory hypothesis that seems 
consonant with many of the data—about 
the sequence of events relating masculinity 
and general adjustment. If the boy finds 
that his father is affectionate and reward- 
ing, acts performed by the father acquire 
secondary reward value. When he emulates 
the father’s behavior, the boy rewards him- 
self and a secondary motivation to imitate 
the father develops and, with further ex- 
perience of rewards following imitation of 
that parent, increases in strength. Con- 
sequently, the child makes more imitative 
responses which are not only self-reward- 
ing, but are also likely to be reinforced 
by the father. The response “acting like 
father” gains great habit strength, the boy 
progressively assumes more and more of 
the father’s attitudes, interests, and reac- 
tions and, thus, more behavior appropriate 
to his sex role. Hence, he becomes highly 
masculine. 

In general, adults and peers expect the 
adolescent boy to be manly and to behave 
in accordance with the cultural specifica- 
tions for his sex role. Boys who adopt 
masculine behavior and characteristics, and 
thus fulfill the general social expectations, 
are likely to encounter social-psychological 
milieux—including attitudes of acceptance 
and favorable treatment by peers and adults 
—which are conducive to the establishment 
of positive self-concepts, personal security, 
and feelings of adequacy, or, in short, to 
good psychological adjustment. 
The available evidence is not adequate
for either confirmation or disproof of the 
validity of this hypothesized sequence of 
events. The interpretation seems particu- 
larly plausible, however, since it is consistent 
with much of the theory regarding the 
determinants of identification, with clinical 
observation, and with the present findings 
and those reported by the author and others 
in earlier studies (Cava & Rausch, 1952; 
Payne & Mussen, 1956; Sopchak, 1952). 

Since the association between high levels 
of masculinity and personal security during 
adolescence was found to be a substantial 
one, and since good adjustment at this 
time is thought to provide a foundation 
for subsequent emotional stability, it would 
be expected that highly masculine boys 
should become well-adjusted adults. Hence, 
it was surprising to find that the tests of 
Hypothesis 4, dealing with the long-range 
consequents of different levels of adolescent 
masculinity, yielded inconclusive—and some- 
times seemingly contradictory—results. 

The boys who had been highly masculine 
adolescents tended, as adults, to be highly 
masculine in attitudes and beliefs and to 
possess more than an average degree of ego 
control. However, at the later period, they 
appeared to be relatively lacking in Do, CS, 
and SA (as measured by CPI scales) and 
scored high in abasement needs (EPPS). 
On the basis of these last findings, these 
men might readily be described as poorly 
adjusted and inadequate individuals, strik- 
ingly changed from what they had been 
during adolescence. 

Some of the apparent inconsistency or 
lack of continuity between adolescent and 
adult personality structure may be at- 
tributed, as noted earlier, to the personality 
assessment devices employed. For example, 
the CPI Do and CS scales undoubtedly 
yield measures of characteristics which are, 
to a large extent, instrumental. At the same 
time, however, these scales are heavily 
weighted with emotional-expressive vari- 
ables, such as sociability and gregariousness, 
in which the highly masculine adolescents 
had been rated relatively low. Therefore, 
it is conceivable that the adult low scores 
in the CPI SA, Do, and CS scales, achieved
by those whose adolescent interests had
been highly masculine, demonstrate under- 
lying predispositions consistent with—and 
perhaps reflecting the constancy of—the 
subjects' adolescent personality traits and 
social orientations. 

Analogously, high scores in need Abase- 
ment in the highly masculine group may 
reflect feelings of inadequacy and inferior- 
ity, qualities which are decidedly contra- 
indicative of good adjustment. On the other 
hand, these scores may be manifestations of 
the ability to face personal problems and 
shortcomings, an ability that may depend 
on the possession of basic feelings of 
security and ego strength. Interpreted in
this way, high scores in need Abasement 
may be evidence of healthy adjustment. 

In essence, the immediately preceding 
discussion argues that the present findings 
do not necessarily contradict the fourth 
hypothesis. In fact, aspects of the data— 
especially those dealing with ego control 
and need for abasement—may be consid- 
ered supportive of it. Since most of the 
scales differentiating the two groups of
adults are obviously subject to diverse, 
sometimes diametrically opposed, interpre- 
tations, it is not possible to assess the 
validity of the hypothesis from these data. 

The differences in the adult test scores
of the two groups of subjects may be explicable
in other ways, too. For example,
assuming that the adult tests yield valid
measures of characteristics such as self- 
confidence, it might be hypothesized that 
in many cases there are radical shifts in 
personality structure after the adolescent 
period. Thus, it could be maintained that, 
as a consequence of their personal and 
social characteristics (especially their sociability,
friendliness, and outgoingness),
the low masculine group may develop 
greater competence in interpersonal rela- 
tionships and may achieve considerable 
social success in young adulthood, resulting 
in markedly increased self-confidence, self- 
acceptance, and social ascendance. These 
characteristics are likely to be strongly 
rewarded and to acquire greater habit 
strength as the subjects become older, so
that by the time they are in their early
thirties, they achieve high scores in scales
measuring “poise, ascendancy, and self-
assurance.”

Conversely, the superior personal adjustments
of adolescent boys with highly masculine
interests may deteriorate in adulthood.
Social initiative and social participation 
seemed to be relatively unimportant to these 
boys, and while this did not appear to 
detract from their emotional adjustment 
or personal security during adolescence, it 
may have had different long-range consequences.
If it is true that young adults
are expected to be more socially oriented
and gregarious than adolescent boys
are, the highly masculine subjects, having 
failed to acquire these attributes, may be 
at a disadvantage in many social, educa- 
tional, and vocational situations. This in 
turn may produce, as the test scores sug- 
gest, weakened self-confidence and self- 
acceptance, decreased ability to assume or 
handle leadership or dominance roles, and 
increased feelings of inadequacy. 
In brief, this hypothesis maintains that 
the linkages between high masculinity and 
good adjustment, on the one hand, and 
between low masculinity and poor adjust- 
ment, on the other, are temporary and 
subject to major changes after adolescence.
Stated in other terms, it postulates that high 
masculinity of interests, while conducive 
to security and adequate personal adjustment
during adolescence, may have deleterious
subsequent effects. Conversely, low
masculinity of interests during adolescence, 
although fostering emotional instability and 
maladjustment at that time, may have some 
favorable consequences with respect to 
adult social effectiveness and attitudes to- 
wards oneself. 
There is a third possible type of hy- 
pothesis that might help explain these data 
on the adult adjustments of the two mas- 
culinity groups. Perhaps the adolescent 
boys who were low in masculinity developed
powerful ego defense mechanisms, such as
denial and compensation, to cope with
underlying feelings of insecurity and inadequacy.
After a long period of continued
practice, these defense mechanisms probably
became stronger, exerting important
influences on ways of responding to
structured personality tests. The adult test 
scores of those who had been low in 
masculinity during adolescence may, there- 
fore, indicate that they are well-adjusted, 
relatively dominant and self-accepting men 
with high capacity for status, relatively 
few feelings of inadequacy, and little given 
to self-criticism. This picture may be in- 
accurate or very superficial, reflecting de- 
fenses rather than basic personality struc- 
ture. By contrast, those who were highly 
masculine and better adjusted as adolescents 
may have less need to develop these kinds 
of defense mechanisms and, consequently, 
may be more likely to respond to test items 
more directly and frankly. As a result, 
their test profiles may suggest that they 
have achieved less adequate adjustments. 
In fact, however, they may simply demon- 
strate less defensiveness rather than less 
emotional stability than the other group. 

To summarize, we have considered three 
vastly different, but plausible, hypotheses 
that might help explain the present findings 
on the adult adjustments of subjects who 
were high and low in masculinity during 
adolescence. The first explanation main- 
tains, in effect, that the obtained group 
differences show a consistency over time 
(i.e., from adolescence to adulthood) in the 
personality patterns of the two groups. 
Some of the data may also be interpreted 
as indicating the continued good adjustment 
of boys who had been highly masculine 
during adolescence. If this explanation of 
the findings is valid, Hypothesis 4 cannot 
be refuted. According to the second ex- 
planatory hypothesis, the findings do not 
confirm Hypothesis 4 and the group differ- 
ences do, in fact, reflect marked changes in 
adjustment—essentially a reversal of the 
adjustment rankings of the two groups— 
after adolescence. The third explanatory 
hypothesis postulates that the favorable 
personality test performance of those who
had previously been low in masculinity 
reflects, not basic emotional security and 
adequate adjustment, but the effectiveness 
of their defense mechanisms. 

Clearly, these are vastly different, and to
some extent, contradictory explanatory hypotheses
for the obtained results. On the
basis of the available data, however, it is 
impossible to evaluate their relative merits. 
Intensive personality studies of these sub- 
jects are now being carried out at the 
Institute of Human Development of the 
University of California, Berkeley. The 
results of these studies will undoubtedly 
help to clarify the present findings and 
contribute immeasurably to the understand- 
ing of the enduring consequents of appro- 
priate and inappropriate sex-typing of 
behavior during adolescence. 


SUMMARY 


Several kinds of data collected in con- 
nection with the University of California 
Adolescent Growth Study were used in 
testing four hypotheses dealing, first, with 
adolescent boys' attitudes towards their 
fathers as antecedents of masculine identi- 
fication and, secondly, with the consequents 
of different degrees of this identification 
on personality and adjustment during 
adolescence and early adulthood (early 
thirties). The criterion of the strength of 
the subject’s male role identification— 
considered in this study to be a consequent 
of identification with the father—was the 
degree to which his interests were appro- 
priately sex-typed, as measured by the MF 
score derived from the Strong Vocational 
Interest Blank which was administered to 
the subjects during their senior year in high 
school. 

The four hypotheses may be summarized 
as follows: 

1. The first hypothesis, based on 
Mowrer’s “developmental hypothesis of 
identification,” states that adolescent boys 
whose interests are strongly and appropri- 
ately sex-typed regard their relationships 
with their fathers as favorable and re- 
warding, while boys with more feminine 
interests are less likely to regard their 
interactions with their fathers in this way. 

2. The second hypothesis maintained 
that, among adolescent boys, those who 
acquire highly masculine interests will 
possess more strongly developed personal
qualities that are considered to be characteristic
of males in our culture. Those
whose interest patterns are relatively 
feminine are also more likely to manifest 
more feminine personal and social charac- 
teristics. 

3. According to the third hypothesis, adolescent
boys who are strongly identified with
the male sex role are more likely to be 
more stable emotionally and better adjusted 
socially than boys who are low in mas- 
culinity. 

4. Assuming that good personal adjust- 
ment during adolescence paves the way for 
subsequent psychological well-being, it is 
postulated that a high degree of appropriate
sex-typing during adolescence will be more
closely related to adequate personal and
social adjustment in adulthood than will a 
low degree of masculine sex-typing during 
this period. 

Tests of these hypotheses involved two
contrasting groups, drawn from a popula- 
tion of 68 adolescent boys who took the 
vocational interest test during their senior 
year in high school. The groups consisted 
of the 20 subjects with the most masculine 
interest scores and the 19 with the most 
feminine (least masculine) scores. The 
basic data consisted of: (a) a series of 
personality tests, some administered during 
the senior year in high school and others 
approximately 16 years later; (b) ratings 
of drives, appearance, personality, and 
social behavior made by trained observers 
who were members of the Adolescent 
Growth Study staff; and (c) results 
of sociometric questionnaires (Reputation 
Tests) answered by the subjects’ class- 
mates. 

Analysis of the responses to the TAT, 
administered during the last year of high 
school, showed that, as had been predicted 
on the basis of the first hypothesis, a sig- 
nificantly greater proportion of the highly 
masculine boys than of the other group 
portrayed their relationships with their
fathers as positive and rewarding. Moreover,
fewer of the former than of the latter
group told stories in which fathers behaved
in restrictive or punitive ways towards their
sons. These findings were interpreted as
confirmation of Hypothesis 1, which is 
essentially the developmental identification 
hypothesis. It was noted, however, that 
these results refer only to the later phases— 
the maintenance and continuation—of the 
identification process. The data give no 
information about its genesis or very early 
determinants. 

A number of findings supported the 
second hypothesis which was tested by com- 
paring the two groups in many TAT, 
rating, and sociometric variables that had 
been judged to be congruent with character- 
istics ascribable, in Parsonian terms, to 
either instrumental (male) or emotional-
expressive (female) roles. As predicted,
high masculinity of interests tended to be
associated with instrumental traits, while
relatively feminine interests were linked 
with expressive characteristics. Thus, the 
TAT responses of the highly masculine 
boys gave more evidence of positive self- 
concepts and self-confidence, instrumental 
traits, and these boys were rated higher in 
masculine behavior than the other group.
The group with relatively feminine interests,
on the other hand, were rated higher
in variables corresponding to emotional- 
expressive characteristics such as depend- 
ency, social initiative, social activity, and 
attention-seeking, as well as in drives for 
succorance (dependency), social ties, and
recognition. Peers also described this group
in terms indicating strong development of
expressive traits: i.e., the boys low in masculinity
were regarded as socially active,
dependent, and attention-seeking. All this
evidence, collected from different sources,
generally confirms the second hypothesis
and demonstrates that strong identification
with the male role is manifested in the
adoption of a coherent set, or pattern, of
sex role behaviors, rather than in the
acquisition of a few isolated sex-appropriate
characteristics.

Judging from data derived from the University
of California Adjustment Inventory,
two series of ratings (based on behavior
at a clubhouse and at the Institute of
Human Development), and sociometric
questionnaires (Reputation Tests), the third
hypothesis was substantially confirmed. The 
highly masculine group exceeded the other 
in overall adjustment, as judged from the 
adjustment inventory, and the subjects in 
the former group were rated by staff observers
as more carefree, more contented,
more relaxed, more exuberant, happier, 
calmer, and smoother in social functioning. 
Peers considered the boys with relatively 
feminine interests to be more restless, i.e., 
to manifest more overt signs of tension. 
In general, it appeared that a high degree 
of masculine identification during adoles- 
cence tends to be associated with personal 
adequacy and emotional stability during 
that period. 

The tests used when subjects became 
adults, the California Psychological Inven- 
tory and the Edwards Personal Prefer- 
ence Schedule, did not yield data adequate 
for the confirmation or rejection of the 
fourth hypothesis. According to these tests, 
boys who had more masculine interests 
during adolescence also appeared to be more 
masculine than the others in adult interests 
and attitudes. They also gave evidence of 
greater ego-control (internalized control 
and ability to delay gratification). However, 
they scored lower than the less masculine 
group in measures of dominance, self- 
acceptance, and capacity for status and high 
in need for abasement. In general, these 
results do not substantiate Hypothesis 4, 
but, for several reasons, they cannot be 
regarded as providing adequate bases for 
rejecting it. Several plausible hypotheses, 
attempting to explain the findings on the 
adult status of adolescent boys of different 
degrees of masculinity of interests, have 
been proposed. These are not verifiable or 
refutable on the basis of the present data, 
but more intensive studies of the adult 
personalities of the subjects may yield data 
that permit careful evaluation of these 
hypotheses. Such studies are now being conducted
at the Institute of Human Development
at the University of California.
There exists in the field of child psychiatry
today a general conviction that
unless the milieu of a relationship that is
central to a child can undergo favorable
change, the most skillful therapeutic intervention
is of little avail. This conviction
has directed the interests of many clinicians
toward investigating the dynamics of primary
group interaction and toward conceptualizing
the pathology of the child within
the psychosocial structure of the family
(Ackerman, 1958; Ackerman & Behrens,
1956; ой, 1954). Among a majority of
child guidance practitioners, it is axiomatic
that psychodiagnosis has become the sine
qua non for planning effective treatment.
AN ASSESSMENT OF THE DIAGNOSTIC PROCESS
IN A CHILD GUIDANCE SETTING


PHILIP A. MARKS
Kansas University Medical Center 


Considering the efforts recently extended 
in assessing the validity of currently popu- 
lar diagnostic techniques, it is surprising 
that little, if any, attention has been directed 
toward evaluating the diagnostic process 
itself. It is well known, for example, that 
in the practical situation psychologists typically
employ several kinds of psychometric
data together with social history, interview,
and observation, rather than relying solely 
upon the blind analysis of a single source of 
information. Yet the efficacy of this traditional,
time-consuming approach is largely
unknown. 

The present study was designed to inves- 
tigate personality characteristics of parents 
of children referred to a clinic for psychiatric
treatment, and to assess the relative
efficiency of various sources of data con- 
sidered by child guidance practitioners in 
arriving at decisions. 

The questions investigated were as fol- 
lows: 

1. Are there personality variables which 
distinguish parents of children referred to 
clinics for psychiatric treatment from 
adults of the general nonpsychiatric popula- 
tion? 

2. Is the diagnostic interview, as em- 
ployed in child guidance clinics, the most 
efficient source of information about 
mothers of clinic patients? For example, 
how accurate are personality descriptions 
of mothers derived by judges from blind 
readings of the MMPI? 

3. How efficiently do clinical psychologists
use their diagnostic time? Are the
tests and techniques employed by them the 
best suited for providing information about 
clinic patients? For example, how accurate 
are personality descriptions of children de- 
rived by judges from blind readings of the 
MMPI profiles of parents? 

4. Can clinicians derive personality de- 
scriptions of mothers and children which 
will better predict a criterion than stereo- 
type descriptions derived by the same clini- 
cians, and composite stereotype descriptions 
derived by other clinicians? 


HYPOTHESES 


Five major hypotheses were formulated 
to answer the questions stated above. 


Hypothesis I. There are personality 
characteristics which distinguish mothers of 
children referred to clinics for psychiatric 
treatment from adult females of the gen- 
eral nonpsychiatric population. 


Hypothesis II. There are personality 
characteristics which distinguish fathers of 
children referred to clinics for psychiatric 
treatment from adult males of the general 
nonpsychiatric population. 


Hypothesis III. From blind readings of
MMPI profiles of mothers, judges can derive
personality descriptions of mothers of
child guidance patients which will correlate
more highly with criterion descriptions derived
by clinic caseworkers than with diagnostic
descriptions derived by the same
caseworkers. 

a. Judges’ MMPI-based descriptions of 
mothers will correlate more highly with a
criterion than will mother stereotype de- 
scriptions derived by the same judges. 

b. Judges’ MMPI-based descriptions of 
mothers will correlate more highly with a 
criterion than will mother stereotype de- 
scriptions derived by clinic caseworkers. 

c. Caseworkers’ diagnostic descriptions
of mothers will correlate more highly with
a criterion (subsequently derived by the
same caseworkers) than will mother stereotype
descriptions derived by the same caseworkers.

d. Caseworkers' diagnostic descriptions 
of mothers will correlate more highly with
a criterion (subsequently derived by the 
same caseworkers) than will composite 
mother stereotype descriptions derived by 
other clinic caseworkers. 


Hypothesis IV. From blind readings of 
MMPI profiles of parents, judges can derive
personality descriptions of children referred
to psychiatric clinics for treatment
which will correlate more highly with cri- 
terion descriptions derived by clinic thera- 
pists than with diagnostic descriptions de- 
rived by clinic psychologists. 

a. Judges’ descriptions of children, de- 
rived from parent MMPI profile data, will 
correlate more highly with a criterion than
will child stereotype descriptions derived by
the same judges. 

b. Judges’ descriptions of children, derived
from parent MMPI profile data, will
correlate more highly with a criterion than
will composite child stereotype descriptions
derived by clinic psychologists. 


Hypothesis V. From psychological ex- 
amination data, psychologists can derive 
personality descriptions of children referred 
to psychiatric clinics for treatment which
will correlate more highly with a criterion
than will descriptions of the same children
derived by judges from blind readings of
the MMPI profiles of parents.


a. Psychologists' diagnostic descriptions 
of children will correlate more highly with 
a criterion than will child stereotype de- 
scriptions derived by the same psycholo- 
gists. 

b. Psychologists’ diagnostic descriptions 
of children will correlate more highly with 
a criterion than will composite child stereo- 
type descriptions derived by other clinic 
psychologists. 
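
Hypotheses III through V all have the same form: one description is expected to correlate more highly with a criterion description than a competing (stereotype or alternative) description does. The following is a minimal sketch of that comparison. It assumes that each description can be coded as a vector of item ratings and that agreement is indexed by a Pearson correlation. The hypotheses as stated here do not specify the scoring method, and the item values below are hypothetical, chosen only for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length rating vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical item ratings (1 = least characteristic, 5 = most characteristic).
criterion = [5, 1, 4, 2, 3, 5, 1, 2]    # criterion description of one case
diagnostic = [4, 1, 5, 2, 3, 4, 2, 1]   # individual diagnostic description
stereotype = [3, 3, 4, 2, 2, 3, 3, 2]   # stereotype ("typical case") description

r_diagnostic = pearson_r(diagnostic, criterion)
r_stereotype = pearson_r(stereotype, criterion)

# A sub-hypothesis of this form is supported for the case when the
# individual description agrees with the criterion more closely than
# the stereotype description does.
print(f"diagnostic vs. criterion: r = {r_diagnostic:.2f}")
print(f"stereotype vs. criterion: r = {r_stereotype:.2f}")
print("supported for this case:", r_diagnostic > r_stereotype)
```

In an actual analysis the comparison would be repeated over all cases and the resulting pairs of correlations tested as a set; the single case above only illustrates the direction of the prediction.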


PROCEDURE 


The procedure in the present study was 
designed to obviate a major criticism often 
leveled against clinical research in a service 
setting: namely, that in such research, the 
investigator, under the guise of expediency, 
often contrives spurious methodological 
procedure. He promotes various changes in 
established routine which may implement 
his purpose and fulfill the requirements of 
rigorous experimental design, but which 
are rarely to be duplicated once his data are 
in. While this criticism does not apply to 
all research in such settings, it does apply 
to most research purported to have prag- 
matic implications. Two recent investiga- 
tions designed to assess the validity of diag- 
nostic instruments can be taken as cases in 
point (Little & Shneidman, 1959; Silver- 
man, 1959). The present study, however, 
necessitates only minimal departure from
the reality of “what is done” in a service 
setting, while hopefully not succumbing to 
the pitfall of weak or inadequate experi- 
mental design. 


Selection of Cases 


Forty-eight clinic cases (i.e., 48 mothers and 48
children), representing 90% of consecutive referrals
accepted for treatment from March 15, 1958
to July 31, 1958 comprised the MMPI normative 
group. Of these, 42 cases in which the mother 
and child remained in treatment for a minimum 
of 5 hours and in which the child was free of 
gross organic (cerebral) impairment and/or men- 
tal deficiency, comprised the validation group. The 
normative and validation samples were presumably
from the same population of cases accepted for
treatment. However, owing to the small difference
between groups (six cases), it was not
deemed feasible to test the assumption statistically. 
Thus, descriptive data are presented for the total 
sample of 48, whereas validation findings, by defi- 
nition, were restricted to the sample of 42.


Children 


Descriptive data were obtained from the 48 
children in the normative group on the major vari- 
ables of sex, age, intelligence, referral source, and 
referral complaint. Where possible, these were 
compared with similar data from the general clinic 
population (G. R. Patterson, 1955), and from re- 
gional studies of comparable groups (Maas, Kahn, 
& Sumner, 1955; O’Neal & Robins, 1958; Roach, 
Gursslin, & Hunt, 1958; Stevens, 1954).

Seventy-five percent of the children were male. 
This finding was consistent with data reported by 
other investigators (Maas et al., 1955; O’Neal & 
Robins, 1958; Roach et al., 1958) indicating a 
tendency toward a disproportionate representation 
of males in child guidance clinic populations. 
Table 1 compares the sex distribution of the 
sample with the sex distribution of the general 
clinic population. Although the proportion of
males was slightly greater in the sample, the difference
between proportions did not approach an
acceptable level of significance (p = .70). With
respect to sex, the proportion of males to females 
selected for treatment was representative of the 
sex ratio among unselected clinic referrals. 


The treatment sample of males averaged 10.4 
years of age with a range of 4.0 to 16.9 (median = 
10.0; SD = 3.53) and the females averaged 9.6 
years of age with a range of 4.2 to 16.7 (median =
9.5; SD = 4.03). Testing for a difference between
means did not yield a significant value (t = .66).
Tables 2 and 3 present male and female age- 
grouped distributions for the sample and general 
clinic population. A chi square analysis did not 
yield a probability value significantly greater than 
chance for data in either table. Thus the age- 
grouped distribution of males and females in the 


TABLE 1 


Sex DISTRIBUTION OF TREATMENT SAMPLE 
AND CLINIC POPULATION 


                Treatment      Clinic
Sex              Sample      Population^a
                    %             %
Male              75.0          69.9
Female            25.0          30.1
Both sexes       100.0         100.0


^a Clinic population data taken from Minnesota Out-Patient
Statistical Report of cases terminated (N = 206) by the
Amherst H. Wilder Clinic during the fiscal year 1956-57.
TABLE 2


AGE DISTRIBUTION OF MALES OF TREATMENT
SAMPLE AND CLINIC POPULATION


                Treatment      Clinic
Age              Sample      Population^a
                    %             %
Under 5            5.7           9.8
5-9               38.8          37.5
10-13             36.1          33.3
14-17             19.4          19.4
All ages         100.0         100.0


^a Clinic population data taken from Minnesota Out-Patient
Statistical Report of cases terminated by the Amherst H.
Wilder Clinic during the fiscal year 1956-57.


sample was representative of similar distributions 
among unselected clinic cases. 
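The comparison of mean ages reported above (t = .66) can be approximated from the published summary statistics alone. The following is a minimal sketch, assuming a pooled-variance Student's t and assuming group sizes of 36 males and 12 females (75% and 25% of the 48 children); the function name `pooled_t` is illustrative, and the monograph does not state which form of the t test was used.

```python
import math

def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Student's t for two independent groups, computed from summary statistics."""
    # Pool the two variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err

# Assumed group sizes: 75% of 48 children = 36 males, leaving 12 females.
t_age = pooled_t(10.4, 3.53, 36, 9.6, 4.03, 12)
print(round(t_age, 2))
```

Under these assumptions the statistic agrees with the reported value to two decimals, which suggests that the published means and standard deviations are internally consistent.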

During the diagnostic period, 90% of the children were administered
either the Stanford-Binet or the Wechsler Intelligence Scale for
Children. The average IQ for 33 males was 100.7 with a range of 71 to
143 (median = 100.0; SD = 16.59). The average IQ for 10 females was
102.0 with a range of 81 to 139 (median = 102.0; SD = 15.18). As the
data suggest, the sexes were similar in IQ distribution. Table 4
presents the results of a comparison between the IQ distribution of
the sample and the IQ distribution of children referred to three Twin
City* child guidance agencies (G. R. Patterson, 1955). For
computational purposes, the IQs were distributed among three
categories: below average (71-90), average (91-110), and above
average (111-145). The average IQ of the sample was 101.0, whereas
the average IQ of 213 unselected agency referrals was 105.6.


* Minneapolis and St. Paul.


The findings presented above and in Table 4 offer some observations
of general interest. First, the IQ distribution of children selected
for treatment closely approximates the theoretical distribution of
the normal population. Second, the IQ distribution among unselected
clinic referrals is negatively skewed, and the mean of the
distribution is numerically higher than the mean of the sample.
Unfortunately, the source of these data does not provide enough
information to test the difference statistically. The direction of
the difference, however, was consistent with findings of other
investigators (Roach et al., 1958; Stevens, 1954). Third, the
differences between the grouped distributions, while not of
acceptable statistical significance (p = .10), suggest that a
disproportionately large number of children with below average
intelligence estimates were selected for treatment. Although this
finding is contrary to expectations, it is nevertheless
understandable in view of the major source of clinic referrals and
the most frequently offered referral complaint. As will be reported
below, school referrals accounted


plaint. A general
these data was that school exigencies, to a large


TABLE 3


AGE DISTRIBUTION OF FEMALES OF TREATMENT
SAMPLE AND CLINIC POPULATION


                Treatment      Clinic
Age              Sample      Population^a
                    %             %
Under 5            8.3          16.2
5-9               50.0          43.5
10-13^b           25.0          22.6
14-17^b           16.7          17.7
All ages         100.0         100.0


Note.—Chi square = .74 with 2 df (p = .70).
^a Clinic population data taken from Minnesota Out-Patient
Statistical Report of cases terminated by the Amherst H.
Wilder Clinic during the fiscal year 1956-57.
^b Groups 10-13 and 14-17 were combined as recommended
by Siegel (1956) for cells with small Ns.


TABLE 4


IQ DISTRIBUTION OF TREATMENT SAMPLE AND
TWIN CITY CHILD GUIDANCE AGENCIES^a


                        Treatment    Twin City
IQ                       Sample      Agencies
                            %            %
Below average (71-90)     25.5         17.4
Average (91-110)          58.1         50.2
Above average (111-145)   16.4         32.4
Totals                   100.0        100.0


Note.—Chi square = 4.83 with 2 df (p = .10).
^a Data taken from G. R. Patterson (1955). Twin City sample
comprises cases from the Amherst H. Wilder Clinic, the Washburn
Memorial Clinic, the Child Study Department, and the In-Patient
Psychiatric Service, University of Minnesota.


“Mt. Zion). 
most expedient use of psychiatric time. The second
involves the known positive correlation between
intelligence estimates and academic success.
If one holds that psychotherapy (at best) is a 
learning situation, then children with high intel- 
ligence, ceteris paribus, would be more likely to 
respond than children not so endowed. Suffice it 
to say that intelligence was not the prime criterion 
for selecting the treatment cases in the present 
sample. 
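The chi-square comparison of IQ categories summarized in Table 4 can be sketched as follows. The cell counts are an assumption: they are rebuilt here from the published percentages and from group sizes of 43 tested children and 213 agency referrals, so the result should only approximate the reported value of 4.83. The function name `chi_square` is illustrative.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and IQ category.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df

# Hypothetical counts rebuilt from the Table 4 percentages
# (43 tested children in the sample, 213 agency referrals).
iq_counts = [
    [11, 25, 7],     # treatment sample: below average, average, above average
    [37, 107, 69],   # Twin City agencies
]
stat, df = chi_square(iq_counts)
# For exactly 2 degrees of freedom the upper-tail probability is exp(-chi2 / 2).
p_value = math.exp(-stat / 2)
print(round(stat, 2), df, round(p_value, 3))
```

With these reconstructed counts the statistic lands close to the published figure, and the tail probability falls just under .10, in line with the significance level given in the table note.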

Data pertaining to source of referral are categorized
in Table 5. With the exception of public and
private agency cases, the categories shown represent
the major sources reported by other metropolitan
clinics (Maas et al., 1955; Stevens, 1954).
Since in most instances, public and private agency 
referrals are made for diagnostic purposes only, 
and since the sample was restricted to cases enter- 
ing treatment, this source was omitted and the 
percentages shown were recomputed from sub- 
totals of the sample distributions. In the absence 
of data for the general clinic population, samples 
from regional clinics are presented for comparison.
Testing for a difference between groups
yielded a significant value (p = .01). Thus, the 
four distributions were dissimilar in their propor- 
tion of cases referred by different sources. 

Among the treatment sample, school referrals 
comprised the largest single source of clinic in- 
take, and were proportionately greater than refer- 
rals from any other source among all clinics 
reported. It would seem likely that in larger cities 
(e.g., Chicago, New York, and San Francisco)
where more numerous facilities exist, school per- 


TABLE 5

SOURCES OF REFERRAL

                       Treatment  Chicago^a  New York^b  San Francisco^b    Total
Source                  Sample    (N = 200)  (N = 347)     (N = 306)      (N = 901)
                           %          %          %             %              %
School                   43.4       22.7       26.1          20.6           24.6
Self, relative, etc.     23.9       34.5       31.8          30.9           31.5
Physician, hospital      23.9       26.2       25.7          38.9           30.3
Courts                    8.7       16.5       16.4           9.5           13.5
Totals                   99.9       99.9      100.0          99.9           99.9


Note.—Chi square = 27.20 with 9 df (p = .01).
^a Taken from Roach (1958). Based on intake at the Guidance ...
^b Taken from Maas et al. (1955). Based on intake at
New York clinics (Jewish Board of Guardians, New Rochelle,
Setthside, White Plains, and Yonkers) and three San
Francisco clinics (Children's Hospital, Langley Porter, and
Mt. Zion).

sonnel would be less inclined to refer problem 
cases to child guidance practitioners. On the other 
hand, where fewer facilities exist, patient loads 
are usually larger and there should be less avail- 
able time for academic problem cases. It is equally 
probable that certain clinics choose to focus upon 
different problems (e.g., mental retardation, organic
impairment, schizophrenia). What is apparent,
however, is that clinics differ in their
orientation and referral policies and that samples 
taken from any one of them are not necessarily 
representative of other agencies. 

A list of symptoms and referral complaints 
which characterized the sample at the time of 
intake appears in Appendix A. The list employed 
was developed by G. R. Patterson (1955) in an 
earlier study of Twin City clinic referrals. The 
incidence ratings reported were based upon de- 
scriptions of children's behavior reported by the 
parents or the referring agency. All tabulations
were made by the investigator who read each 
intake report and recorded every symptom and 
complaint appearing therein. Table 6 compares the 
five most frequently reported complaints for chil- 
dren in the sample with their frequency among 
unselected clinic patients. It can be seen from 
Table 6 that the groups yielded a difference that 
was statistically significant beyond the .001 level. 
Table 7 compares the five most frequently re- 
ported complaints for unselected clinic referrals 
with their frequency among the treatment sample. 
Again, a significant difference was found (p = .05).

Before commenting on these findings, a word of 
caution is necessary. Although first consideration 
was given to objectivity of recording (i.e., only
symptoms and complaints explicitly stated were
tabulated), no reliability estimates were available
for the ratings. Furthermore, no attempt was 
made to distinguish "major" from "minor" com- 
plaints, and thus the data do not necessarily reflect 


TABLE 6 


INCIDENCE OF Five Most FREQUENT COMPLAINTS 
AMONG CHILDREN IN THE TREATMENT SAMPLE 
WITH CORRESPONDING INCIDENCE AMONG 
THE GENERAL CLINIC POPULATION 


                               Treatment     Clinic
Referral Complaint              Sample     Population^a
                                   %            %
Unable to achieve in school      45.8         48.5
Poor peer relationships          25.0         11.8
Nervous                          25.0          8.1
Sleep disturbances               25.0           .8
Fearful                          25.0          4.4


Note.—Chi square = 32.20 with 4 df (p = .001).
^a Taken from G. R. Patterson (1955). Data represent an
unselected group of 136 cases referred to the Amherst H.
Wilder Clinic.
TABLE 7


INCIDENCE OF FIVE MOST FREQUENT COMPLAINTS
AMONG CHILDREN IN THE GENERAL CLINIC
POPULATION WITH CORRESPONDING INCIDENCE
AMONG THE TREATMENT SAMPLE


                                        Clinic      Treatment
Referral Complaint                   Population^a    Sample
                                          %             %
Unable to achieve in school             48.5          45.8
Defiant, nonconforming, disobedient     16.9           6.2
Immature                                12.5           2.1
Poor peer relationships                 11.8          25.0
Unable to concentrate                   11.8          12.5


Note.—Chi square = 10.99 with 4 df (p = .05).
^a Taken from G. R. Patterson (1955). Data represent an
unselected group of 136 cases referred to the Amherst H.
Wilder Clinic.


the major problems for which the children were 
referred. Most important, it is not known whether 
the data from the two samples were recorded un- 
der similar conditions. It is possible that the differ- 
ence obtained represented variations in recording 
procedure rather than a difference in types of 
complaints presented by children in the groups. 
Having stated this precaution, certain observations 
are noteworthy. The most striking finding is the 
relatively high incidence of school achievement 
problems in both groups. For approximately half 
of all clinic referrals, and for a proportionate 
number of children selected for treatment, "inabil- 
ity to achieve in school” was offered as a present- 
ing complaint. This finding was consistent with 
data showing a disproportionately large number of 
children in treatment referred by schools. Thus, 
where the school served as the major source of 
clinic referrals, school achievement problems con- 
stituted the most frequently offered referral com- 
plaint. It can also be seen from Tables 6 and 7 
that behavior which is most offensive to the cul- 
ture (“defiant,” “nonconforming,” “disobedient,” 
and “immature”), and for which society presum- 
ably makes referrals, is relatively infrequent 
among children selected for treatment. In contrast, 
behavior presenting the most subjective discom- 
fort (“nervous,” “fearful,” and “sleep disturb- 
ances”), has a high incidence among children 
selected for treatment, but was rarely present 
among unselected clinic cases. 


Parents 


Descriptive data were obtained for age, marital 
status, and education of the children’s parents, and 
for socioeconomic status, religion, and racial 
origin, of the children’s families. Where possible, 
these data were compared with norms for the 
general clinic population, and with Minnesota male 


and female standardization data (revised) for the 
MMPI (Hathaway & Briggs, 1957). 

The mothers averaged 37.1 years of age and 
ranged from 27 to 56 (median = 35.0; SD = 7.03), 
and the fathers averaged 40.7 years of age and 
ranged from 26 to 55 (median = 38.0; SD = 4.94).
In 26% of the families the mother was older than 
the father (mean = 1.5 years; range 1 to 3). 
Table 8 compares age data for the sample with 
age characteristics of the MMPI standardization 
group. It can be seen from Table 8 that the 
sample means and variances of both sexes differ 
significantly. The parent sample was older and 
more homogeneous than adults representative of 
the general Minnesota population. This finding, 
however, was to be expected. The sampling of
parents precludes a normal age distribution. A 
comparison of the age distribution for both groups 
revealed that 27% of males and 24% of females 
included in the MMPI standardization group were 
classified within the adolescent age range (16 to 
25 years), whereas none of the parents in the 
sample were younger than 26 years of age. 
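The variance comparison for fathers' ages can be checked from the standard deviations given in Table 8. This is a minimal sketch, assuming the tabled F is the simple ratio of the larger to the smaller sample variance; the function name `variance_ratio` is illustrative and is not taken from the source.

```python
def variance_ratio(sd_a, sd_b):
    """F ratio of two sample variances, with the larger variance in the numerator."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)

# Age SDs for fathers in the sample and for MMPI standardization males (Table 8).
f_males = variance_ratio(4.94, 10.93)
print(round(f_males, 2))
```

Under that assumption the ratio comes out at about 4.9, in line with the F tabled for males, which supports reading the parent sample as markedly more homogeneous in age than the standardization group.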

Thirty-seven of the 48 families studied were 
intact (i.e., were not disrupted by death, separation,
or divorce) at the time of intake. Of the
remaining 11, 7 were broken due to divorce and 
4 were disrupted by fathers' deaths. 

Educational characteristics of the parent sample 
and the MMPI standardization group are given 
in Table 9. The fathers averaged 10.1 years of 
education and ranged from 8 to 16 (median =
10.0; SD = 2.59), and the mothers averaged 10.3
years of education and ranged from 7 to 17 (median
= 10.5; SD = 2.55). Among the MMPI
group, males averaged 9.7 years of education and
ranged from 3 to 18 (median = 8.0; SD = 2.78),
and females averaged 10.5 years of education and 
ranged from 3 to 18 (median = 10.0; SD = 2.83). 
Although the sample distribution was negatively 
skewed, testing for differences between means and 
variances failed to yield ratios significant at an 
acceptable level of confidence. With respect to 
years of education, the parents in the sample were 


TABLE 8


AGE CHARACTERISTICS OF PARENT SAMPLE AND
MMPI STANDARDIZATION GROUP


Subjects        N     Mean     SD        t         F
Males
  Parent        42    40.7     4.94    6.16**    4.89**
  MMPI         226    34.3    10.93
Females
  Parent        48    37.1     7.03    2.92**    2.13*
  MMPI         315    34.5    12.02


* Significant between the .10 and .05 levels.
** Significant at the .01 level or beyond.
TABLE 9

EDUCATIONAL CHARACTERISTICS OF PARENT
SAMPLE AND MMPI STANDARDIZATION
GROUP

Subjects        N     Mean     SD       t       F
Males
  Parent        42    10.1    2.59     08     1.11
  MMPI         226     9.7    2.78
Females
  Parent        48    10.3    2.55     04     1.23
  MMPI         315    10.5    2.83


representative of adults in the general Minnesota 
population.

The Minnesota Scale of Paternal Occupations 
(Goodenough & Anderson, 1931), was used as an 
index of family social status. In Table 10, the 
occupational distribution of the parent sample is 
compared with distributions from the clinic popu- 
lation and the MMPI standardization group. Farm- 
ers (Class IV) have been excluded owing to 
their low frequency (.3%) in the clinic population 
and their nonexistence in the other samples. Test- 
ing for a difference between groups yielded a 
value significant at the .001 level. The three
sample distributions did not come from the same 
homogeneous population. 

Table 10 shows that the clinic extends diagnostic 
service to all occupational groups (except Farm- 
ers), but with some underrepresentation of the 
lower class. The fact that 70% of the families


TABLE 10


OCCUPATIONAL CHARACTERISTICS OF PARENT
SAMPLE, CLINIC POPULATION, AND MMPI
STANDARDIZATION GROUP


                                       Parent     Clinic        MMPI
Occupational Groups                    Sample   Population^a    Group
                                          %          %            %
I   Professional                         8.3        5.1          7.8
II  Semi-professional                    4.2       14.9          5.2
III Skilled, clerical, business         39.5       15.6         19.4
V   Semi-skilled, minor clerical        31.2       32.2         23.0
VI  Slightly-skilled                    12.5       18.0         35.9
VII Day laborers                         4.2       14.2          8.6


Note.—Chi square = 75.03 with 10 df (p = .001).
^a Taken from G. R. Patterson (1955). Clinic population data
represent a combined distribution of homogeneous samples
from three Twin City agencies.


were representative of Class III and Class V 
occupations, suggests that treatment was a service 
extended primarily to the middle class. This find- 
ing was not in keeping with reports of other 
investigators which show proportionately more 
upper-class families among child guidance referrals
(Maas et al., 1955; Roach et al., 1958), and a
proportionately greater number of upper-class 
families receiving treatment services (Hollings- 
head & Redlich, 1958). There are undoubtedly 
many interrelated factors which determine a given 
clinic’s case load. Referral source, intake policy, 
and general orientation toward treatment have all 
been mentioned above. It is quite possible that in 
clinics where medical referrals are proportionately 
few, and where school referrals comprise the 
largest single source of intake, the social class 
representation might be expected to approximate 
the normal distribution. 

No data were available pertaining to the re- 
ligious composition of the clinic population. 
Among the families in the treatment sample, 54% 
were Protestant, 29% were Catholic, 5% were 
Jewish, and 12% expressed no preference or their 
denomination was unknown. 

Only one family in the normative group was 
nonwhite as compared with an estimated 5% non- 
white in the clinic population. Since this case 
terminated before reaching the criterion number 
of treatment interviews, 100% of the validation 
sample were Caucasian. 


Clinical Evaluators 


Child Psychologists 


Two staff psychologists and two third-year 
clinical psychologist trainees served as child diag- 
nosticians. They differed markedly in academic 
training, theoretical orientation, years of experi- 
ence, clinical interests, and bias toward various 
tests. In Table 11, the psychologists are identified 


TABLE 11

DESCRIPTIVE DATA CONCERNING CHILD
PSYCHOLOGISTS

                   Months of Clinical Experience     Diagnostic Q Sorts
                                                     Children (N = 42)
Psychologist    Study Clinic   Elsewhere   Total        N        %
A                   120           120       240        23      54.8
B                    54            24        78        11      26.2
C^a                  22             6        28         4       9.5
D^a                  12             6        18         4       9.5


^a Third-year clinical psychologist trainee.
as A, B, C, and D, and descriptive data are presented
for each.

At the time of intake, each family was assigned 
to one of four clinic teams and all diagnostic ap- 
pointments, in addition to the date of the diag- 
nostic conference, were scheduled in advance of 
the family’s first formal clinic contact. The diagnostic
period (i.e., the time between intake and
the diagnostic conference) ranged from 13 to 46 
days and averaged 27 days. The number of chil- 
dren assigned to each psychologist varied greatly 
and was ultimately determined by such factors as 
size of case load, preference, and training expedi- 
encies. Consequently, trainees, who were employed 
on a half-time basis only, were assigned fewer 
children than clinic staff. It can be seen from 
Table 11 that over one-half the children were 
assigned to Psychologist A, whereas only about one-tenth
were assigned to each of Psychologists C and D.
Moreover, the two staff psychologists evaluated 
81% of the children as compared with 19% 
evaluated by the two psychologist trainees. The 
diagnostic evaluations of children were essentially 
a staff enterprise. 

A description of the tests and techniques used 
by the psychologists is of major importance, espe- 
cially since the accuracy of judgments derived 
from them was the focus of concern. It has been 
mentioned above that a major consideration of the 
design was to adhere to established clinic routine 
throughout the study. Accordingly, psychologists 
were permitted to select whatever instruments they 
desired, or to spend their time however they 
chose. Table 12 presents a list of the tests employed and their
frequency of use. The data in Table 12 reveal several interesting
facts about the clinical interests and test biases of the clinic
psychologists. First, the Draw-A-Person technique, despite recurrent
reports of invalidity (e.g., Swensen, 1957), was the most popular
instrument in the psychologists’ armamentarium. Perhaps this can be
justified in part by its professed use as a rapport-gaining device,
and in part by its use as an estimate of intelligence. Second,
infrequently used were various projective techniques (e.g., TAT, CAT,
Blacky pictures) which are presumably unique in their contribution of
dynamic content. Third, achievement and intelligence tests were
commonly used and, with few exceptions, were found to complement the
Rorschach and Draw-A-Person in forming a basic battery.


TABLE 12


PSYCHOLOGICAL TESTS AND TECHNIQUES
ADMINISTERED TO CHILDREN
(N = 39)^a


Instrument                          N       %
Draw-A-Person                      34     87.1
Rorschach                          30     76.9
Reading Achievement^b              29     74.3
Stanford Binet (Forms L or M)      28     71.7
Arithmetic Achievement^b           20     51.2
Spelling Achievement^b             16     41.0
WISC                               12     30.7
Sentence Completion                11     28.2
Bender-Gestalt                      9     23.1
Make-A-World                        8     20.5
Fables, 3 Wishes, etc.              4     10.2
TAT                                 4     10.2
Blacky Pictures                     4     10.2
CAT                                 3      7.7
Ammons Picture Vocabulary           3      7.7


^a Excluding three children who refused testing.
^b Wide-Range Achievement Test.


The tests were also grouped into the major areas of personality,
intelligence, and achievement. Table 13 gives the source and amount
of data available to clinic psychologists at the time they made their
diagnostic judgments. It can be seen from Table 13 that the modal
psychological examination of children consisted of an intelligence
estimate, three personality measures, and in most instances, some
index of academic progress. For all but three children, the
examinations seemed optimal since they covered three major areas of
function in addition to providing opportunity for observation and
interview. Evaluative data for the three children who refused testing
were obtained from observation and interview.


TABLE 13


SOURCE AND AMOUNT OF DATA AVAILABLE TO
PSYCHOLOGISTS FOR MAKING CLINICAL
JUDGMENTS
(N = 39)^a


Source              N       %
Personality^b      39    100.0
Intelligence       38     97.4
Achievement        32     82.0


^a Excluding three children who refused testing.
^b Mean = 2.67 instruments per child.


It is important to emphasize that the relative accuracy of
psychologists’ descriptive and inferential statements about children,
derived from an integration of psychometric, interview, and
observational data (i.e., the clinician-as-an-instrument), was the
variable under investigation. The diagnostic process was a matter of
concern only in the degree to which the psychologist contributed to
it.

Following completion of the psychological examination and prior to
the diagnostic conference, the psychologist compiled a Q-sort
personality description of the child. Each Q-sort description
constituted a structured and comprehensive psychological report and
served as the basis for a comparative evaluation with similar
descriptions independently derived from the MMPI.

Reliability data for child diagnostic descriptions
were obtained from each clinic psychologist. The 
two psychologist trainees repeated sorts of their 
final case 2 days following their original Q-sort 
descriptions. Reliability estimates were obtained 
from each staff psychologist for each of five
children assigned to them for treatment. The latter
re-sort descriptions also served as criterion
measures and were completed approximately 3 
months following the date of the diagnostic Q- 
sort descriptions. 


Social Caseworkers 


Diagnostic and criterion Q-sort descriptions of 
mothers were compiled by four staff psychiatric 
caseworkers. Each had been trained at the Uni- 
versity of Minnesota School of Social Work, and 
had had at least 7 years of casework experience. 
Their experience at the study clinic ranged from 
2 to 12 years. In Table 14, the caseworkers are 
identified as W, X, Y, and Z, and descriptive data 
are presented for each. 

The caseworkers' initial task involved soliciting 
parent cooperation in the testing program. Parents 
were first apprised of the testing request by the 
coordinator of casework service (Chief Social 
Worker) at the time of intake. They were told 
that the clinic was engaged in a research program 
designed ultimately to aid in a better understand- 
ing of parent-child relations and that all parents 
were requested to complete a personality inventory 
(MMPI) early in the course of diagnostic study. 
Parents were then reminded of the testing request 
at the close of their first diagnostic appointment. 
Caseworkers were responsible for scheduling time 
sufficient to insure completion of the inventory 
before the date of the diagnostic conference. In 
the majority of cases, appointments were sched- 
uled the hour preceding or following the second 
diagnostic interview. The clinic receptionist ad- 
ministered tests and collected the booklets and 


TABLE 14

DESCRIPTIVE DATA CONCERNING PSYCHIATRIC
CASEWORKERS

                  Months of Casework Experience       Q Sorts^a
                                                   Mothers (N = 42)
Caseworker      Study Clinic   Elsewhere   Total      N        %
W                   145            72       217      15      35.7
X                    98             0        98      11      26.2
Y                    87             9        96      10      23.8
Z                    25            60        85       6      14.3


^a Includes both diagnostic and criterion descriptions.


answer sheets. The latter were removed from
the clinic daily by the investigator. No profes- 
sional personnel had access to MMPI data at any 
time during the course of the study proper. Test 
data were made available to them only upon com- 
pletion of criterion Q-sort descriptions. As a 
result of the above procedure, every mother com- 
plied with the testing request. The return for 
fathers was 65% (N = 32). It is noteworthy that 
no excessive pressure was extended to secure par- 
ent cooperation and, in retrospect, it was believed 
that had greater effort been made, more fathers 
would have cooperated. 

The diagnostic period for parents was identical 
in length to that for children (cf. children above). 
The number of scheduled interviews during this 
period ranged from 2 to 5 and averaged 3. The
number of parents assigned to each caseworker 
was primarily determined by availability and case 
load.

Upon completion of the final diagnostic inter- 
view and prior to the diagnostic conference, each 
caseworker compiled a Q-sort description of the 
mother using the same items employed by the 
diagnostician and therapist in describing the child.
Each description constituted a diagnostic report 
and served as the basis for comparative evalua- 
tions with similar descriptions independently de- 
rived from the MMPI. 

Treatment of the mother was in all instances 
undertaken by the caseworker assigned diagnostic 
responsibility for the case. Thus each caseworker 
who compiled a diagnostic description also compiled 
a criterion description of the same mother follow- 
ing an average of seven postconference interviews. 
The number of treatment interviews ranged from 
three to nine, and the minimum number of total 
interviews (i.e., diagnostic and treatment combined)
was six. The families participated in the
study for an average of 104 days (range 56 
to 167). 


MMPI Judges 


Seven nonclinic judges performed the somewhat 
formidable tasks set forth in the design. There
were discernible differences among them in aca- 
demic status, time spent in clinical settings, type 
of setting in which they worked, and experience
with the MMPI. Six were clinical psychologists 
who either had experience in child guidance set- 
tings or were familiar with child guidance popu- 
lations. One was a child psychologist with no 
clinical experience. Four judges held PhD degrees 
and three were third-year graduate students, then 
enrolled in doctoral programs at the University 
of Minnesota. Finally, in terms of MMPI experience,
four judges could be classified as “experts”
and three could be classified as “neophytes.” 

All judges were preselected by the investigator. 
Two factors were considered of utmost impor- 
tance in their selection. First, there is mounting 
evidence which shows that clinical predictions 




from adult psychiatric patient MMPI profiles by 
experienced clinicians may have high external 
validity, e.g., agreement with therapists (Duker,
1958; Halbower, 1955; Little & Shneidman, 1954), 
but that predictions by clinicians lacking experi- 
ence may be only of moderate accuracy (Duker, 
1958), or of limited value (Sines, 1957). With 
few exceptions, the MMPI is not the instrument 
of choice among child guidance practitioners and 
it is indeed doubtful whether many MMPI experts 
exist in child guidance settings. Thus, to evaluate 
the usefulness of the instrument in such settings, 
it was considered important to estimate the ac- 
curacy with which relatively inexperienced persons 
could predict the criterion, and to determine the 
accuracy which might be expected under more 
ideal conditions." Second, another major finding 
of recent research suggests that a thorough knowledge
of the population base rates (i.e., the relative
incidence of various behavior patterns among the 
class of persons under consideration) is essential 
for the effective use of clinical data (Halbower, 
1955; Meehl & Rosen, 1955; Sines, 1957). Thus 
six of the seven judges were selected because of 
their familiarity with child guidance populations. 
Table 15 identifies the judges and reports descrip- 
tive data pertaining to each. 

Two separate prediction tests were required of 
judges by the design. First, Q-sort descriptions 
of mothers were compiled from blind readings of 
MMPI profiles, and second, Q-sort descriptions of 
children were extrapolated from parents’ MMPI 
profile data. For both tasks, judges were provided 
with standard MMPI profiles plus information 
pertaining to the child’s age, sex, grade, and num- 


"Findings which pertain to this evaluation are 
reported in the original paper only (Marks, 
1959b). 


TABLE 15

DESCRIPTIVE DATA CONCERNING MMPI JUDGES

               Clinical Experience     MMPI-Based Q Sorts
                                      Mothers and Children
                                          (N = 42)^b
Judges^a            Months                N        %
I^c                  168                  8      19.0
II^c                 120                  9      21.5
IV                   108                  8      19.0
V                     36                  9      21.5
VI                    12                  8      19.0


^a Excluding Judges III and VII, who derived Q-sort descriptions
of adolescents only. The results of these descriptions can
be found elsewhere (Marks, 1959b).
^b Read: 42 mothers and 42 children.
^c MMPI expert.


ber of siblings; and the parent's age, education, 
and occupation (hereafter referred to as “minimal 
data"). 

Q-sort descriptions of 42 mothers and 42 children
were compiled by five of the seven judges.


Each judge was provided with both parents' 
MMPI profiles (when available) and a mimeo- 
graphed sheet of instructions containing the fol- 
lowing minimal data: 

    Enclosed is/are the MMPI profile/s of a mother
    and father whose ______ year old boy/girl
    in the ______ grade was recently referred to
    a child guidance clinic. The child is living with
    both parents/the mother only, ______ brothers
    and ______ sisters. Both mother and child
    are currently being seen in treatment. Using the
    135 Q statements, describe the mother and child
    as they would look to a therapist knowing them
    well.

In the absence of profile data for father, the 
appropriate reason was given (ie, divorced, de- 
ceased, separated, or refused testing). Typically, 
though not necessarily, the procedure involved 
compiling mother Q-sort descriptions first and 
child descriptions second. Following each sort the 
items were recorded and the Q deck reshuflled. 
After completing sorts for all cases assigned, each 
judge resorted one case selected at random by the 
investigator from among those previously de- 
scribed. The time lapse between sorts averaged 
2 weeks (range 7 to 20 days). The repeat sorts 
served as the basis for reliability estimates of 
judges' predictions of mothers and children. 


Criterion for Child Descriptions 


Seven members of the clinic staff and two resi- 
dents in psychiatry served as child therapists. 
They represented each of the three disciplines in 
the clinic and, like the diagnosticians, differed 
markedly in academic training and years of clini- 
cal experience. The therapists are identified in 
Table 16 and descriptive data are presented con- 
cerning the professional status, clinical experience, 
and number of children evaluated by each. Table 
16 shows that each therapist had at least 3 years 
of total clinical experience and, with the excep- 
tion of the two residents and of one psychiatric 
groupworker, each had spent a minimum of 5 
years at the study clinic. The median months of 
total experience was 90 (range 36 to 444). 

Each child-therapist assignment was made at
the time of the diagnostic conference. While these
were based largely upon the child’s needs, staff
availability, interest, and training requisites were
known to have affected some assignments. Accordingly,
8 children were assigned to activity groups
only, whereas 34 children were assigned to therapists
for individual treatment. Each therapist
knew that his patient was involved in the study
and would be rated (Q sorted) later in therapy.
Among the 34 children selected for individual
treatment, 28 (82%) were assigned to psychiatrists,
5 (15%) were assigned to psychologists,
and 1 child was assigned to a caseworker. Thus,
for the majority of the cases, where staff psychologists
served as child diagnosticians, staff and
resident psychiatrists served as child therapists.


TABLE 16
DESCRIPTIVE DATA CONCERNING CHILD
THERAPISTS

              Months of Clinical Experience     Criterion Q Sorts:
                                                Children (N = 42)
Therapist    Study Clinic   Elsewhere   Total      N        %

1                324           120       444       6      14.3
2                120           120       240       3       7.1
3                 60            36        96       2       4.8
4                 87             9        96       1       2.4
5                 66            24        90       9      21.4
6                 72             0        72       4       9.6
7ᵃ                 4            54        58       8      19.0
8ᵃ                 5            31        36       5      11.9
9                 12            24        36       4       9.5


ᵃ Resident in child psychiatry.

After the child had been seen for an average of
10 interviews, the therapist compiled a Q sort
using the same items employed earlier by the
psychologist. The actual number of interviews
varied between 5 and 13. Adopting 5 interviews
as a minimum was not altogether an arbitrary
decision. Meehl⁸ has found that correlating
Q-sort descriptions derived from successive interviews
with a criterion of 24 interviews yields a
rapidly accelerating curve which reaches an
asymptote between the fourth and fifth interviews.
Ergo, the decision was made to include all children
in the sample who remained in treatment for
a minimum of 5 hours, and to have therapists
otherwise defer compiling criterion descriptions
until after the tenth interview.

The express purpose of the present design precluded
uncontaminated clinic Q-sort descriptions.
No attempt was made to deprive therapists of
psychometric data about children, for such data
are normally available to them and therapists’
judgments are typically founded to some extent
upon data available in psychological reports.
Moreover, 5 of the 42 children were assigned for
treatment to the psychologist who had evaluated
(and Q sorted) them diagnostically. The net result
was that in 37 cases the criterion descriptions
were confounded to some extent by psychometric
data, and in 5 cases the diagnostician served as the
child therapist.


⁸ Personal communication, Paul E. Meehl, November
1958.


Notwithstanding the accessibility of clinic fold- 
ers, test results, and psychological reports, thera- 
pists were instructed to rely upon their own 
therapy notes in deriving criterion descriptions. 
The latter was intended to effect more accurate 
descriptions rather than delimit data which were 
normally available to them. 

The major consequence of the above procedure 
warrants comment. Comparing judgments made 
by different persons on essentially the same data 
(or by the same persons on the same data) yields 
information concerning the reliability of judgment 
only. However, once such information is obtained, 
it can usefully serve as an estimate of the maxi- 
mum validity for judgments independently derived 
from other sources. For example, one objective 
of the study was to evaluate the comparative ac- 
curacy of psychologists’ diagnostic descriptions of 
children and of descriptions of the same children 
independently derived by judges from blind read- 
ings of parent MMPI profiles. Confounding the 
criteria with psychometric data would only result 
in spuriously high estimates of accuracy for clinic 
psychologists. Thus, under these circumstances,
the extent to which judges’ descriptions were
found to approximate (or to surpass!) the accuracy
achieved by psychologists would only be
enhanced if the criteria were derived independently
of the data which served as the source of the
latter’s initial judgments. It follows that the
criteria must also be derived independently of the
MMPI. This was assured, of course, by removing
the answer sheets from the clinic immediately 
upon completion of testing. MMPI data were 
inaccessible to therapists prior to their criterion 
evaluations. 

Reliability data were obtained for child criterion 
descriptions by having five therapists compile re- 
sorts of their final case within 2 weeks of their 
initial criterion descriptions. 


Estimating the Accuracy of the Criterion 


Owing to the fact that each child criterion description
was compiled by a single clinician (the
child’s therapist), a source of variance in the criterion
was introduced which in part can be attributed
to individual differences in clinical frames of
reference, and in part reflects differences in clinical
ability among therapists. That this variance could
have a deleterious effect upon diagnostic predictions
seems obvious; i.e., the most direct effect would
be to lower the Q correlations (attenuating the
true variance) of the various data-based predictions.
To obviate such a state of affairs, a precaution
was taken to insure at least moderate validity
of the criterion. The correlations of therapists’
criterion descriptions with psychologists’ diagnostic
descriptions were compared for each child.
Since the criterion was contaminated by psychometric
data, it was decided that the lowest average
(acceptable) agreement between clinic therapists
and psychologists should exceed chance. Therefore,
a therapist whose average correlation across
psychologists was not significantly greater than
zero at the .001 level or beyond should be eliminated
and his descriptions discarded from further
analysis. Table 17 presents the average correlations
of clinic therapists with clinic psychologists
for the 37 children described by both.

Obvious from Table 17 is the fact that none
of the criterion descriptions compiled by Therapist 9
showed better than chance agreement with
diagnostic descriptions compiled by clinic psychologists.
This was a most disconcerting finding and
suggested a definite lack of communication between
this therapist and other members of the
clinic staff. For the purpose stated above, the
four criterion descriptions compiled by Therapist 9
were discarded and the number of children
included in the sample was reduced to 38.

When therapists’ total clinical experience (q.v. 
Table 16) was compared with their average agree- 
ment with psychologists, and a rank order correla- 
tion computed, a coefficient of .57 was obtained 
(p = .10). While not of acceptable statistical 
significance, it suggested some relationship be- 
tween diagnostician-therapist agreement and thera- 
pist clinical experience. 


Selecting the Q Array 


A decided advantage of the Q sort in psychiatric
research is that it facilitates development of a
standard language of behavior whereby different
clinicians, making independent interpretations of
similar or dissimilar data, can communicate impressions
in an objective form suitable for statistical
evaluation. Furthermore, if the language is
so developed as to be suitable for children and
adults of both sexes, parent-child comparisons can
be made. While the latter was not a feature of
the present study, items were selected with this
expediency in mind. In addition to age-sex nonspecificity,
several other criteria were used to
assure the pertinence, ratability, variability, and
comprehensive coverage of the statement pool.


TABLE 17

MEAN CORRELATIONS OF THERAPISTS’ CRITERION
DESCRIPTIONS WITH PSYCHOLOGISTS’
DIAGNOSTIC DESCRIPTIONS
(N = 37)ᵃ

Therapistᵇ     N      Mean      Range

1              6      .41       .28 to .60
4              1      .47       ...
5              9      .29       .05 to .48
6              4      .49       .21 to .65
7ᶜ             8      .48       .26 to .67
8ᶜ             5      .27       .20 to .44
9              4     −.09      −.33 to .16


ᵃ Excluding five children assigned to clinic psychologists for
treatment.
ᵇ Excluding Therapists 2 and 3 (Psychologists A and B), who
compiled diagnostic Q-sort descriptions.
ᶜ Resident in child psychiatry.


Sources of Items 


In an attempt to secure a comprehensive sample of
items representative of the extensive domain of 
personality, and sufficient for the description of 
any case, normal or pathological, several sources 
were explored. Approximately 100 items were ob- 
tained by the investigator, who culled descriptive 
and inferential statements from the folders of 
cases filed at the study clinic; 205 items were 
borrowed from a pool compiled earlier by Paul E. 
Meehl; and 1,538 items were acquired from a 
pool then under development at the University of 
Minnesota by the Ford Foundation Research Proj- 
ect on Diagnosis in Psychiatry. 

The purpose of the Ford project has been
described (Duker, 1958) as “a research program
for investigation of personality dimensions having
maximum descriptive precision and predictive
power in the understanding of psychiatric patients”
(p. 64). The initial phase of this program was
devoted to the compilation of 6,000 descriptive
items which were presumably representative of
the personality sphere “entirely and accurately.”
The following areas were sampled: Interests,
avocational; Interests, vocational; Value-orientation;
Primary-group relationships; Attitudes; Mood 
and temperament; Vocational activity; Manifest 
interpersonal patterns; Psychopathology; Ethical 
behavior; Self-concept; and Ego-organization and 
"character structure" (Studdiford, unpublished). 


Once compiled, the pool was submitted to “armchair”
screening and items were eliminated which
were judged to be any of the following: ambiguous,
rare, genotypic, unknown to therapists, atomistic,
synonymous, extreme, unfamiliar, evaluative,
broad, “double-barreled.” The intact pool at the
time of the present study contained 1,538 items
which had survived the stringent screening criteria.
As alluded to above, the pool contained
phenotypic items only, i.e., items which refer to
“relatively surface, objective, descriptive, ‘summarizing’
aspects of behavior” (i.e., manifest
traits) as opposed to genotypic, which refer to
“internal events, states, structures inferred from
the phenotype, and often said to ‘explain’ it” (i.e.,
latent traits) (Meehl, unpublished). Since the
purpose of the present study was to standardize a
pool containing both genotypic and phenotypic
item content, Meehl’s pool served as the primary
source of genotypic items. Thus, from the sources
mentioned above, approximately 2,000 items were
obtained for potential selection.

In view of the relatively objective criteria of
age and sex nonspecificity, the pool was first
screened by the investigator, who eliminated 1,381
items which were either duplicated or a priori
seemed inappropriate for adults and children of
both sexes. The remaining 619 items comprised
the basic pool which was then submitted to
empirical screening by the study clinic staff.


Screening Items for Clinical Pertinence 


The 619 items were then submitted to two staff 
psychiatrists who were instructed to rate for 
"clinical pertinence." For an item to be considered 
pertinent, it had to be nonspecific with respect to 
age and sex, and had to contain information com- 
monly discussed at team meetings, case conferences, 
etc., and/or have diagnostic or treatment implications.
Although the first criterion was considered
by the investigator in the original selection of 
items, it was believed necessary to have the pool 
rerated on this criterion to insure elimination of 
any item missed during the initial screening proc- 
ess. Thus, the judges were instructed first, to 
evaluate the appropriateness of each item for 
adults and children, and second, to consider the 
pertinence of each item for the clinic population. 
The categories of “suitable,” “questionable,” and 
"unsuitable" were designated for this purpose. 
Any item considered inappropriate for adults and 
children was automatically rated “unsuitable.” All 
other items were considered pertinent and rated 
either “questionable” or “suitable.” 

Of the 619 items submitted for screening, 476 
were rated suitable, and 32 were rated question- 
able by both judges. The interjudge agreement 
was 82%. For all remaining items the judges were 
mixed in agreement. Only those items were re- 
tained for further screening which either both 
judges agreed were suitable (476 items), or one 
judge rated suitable and the other rated question- 
able. The latter combinations comprised 43 and 56 
items, respectively. Of the 619 items thus screened, 
575 met the criterion of clinical pertinence. 
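
The counts in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes, as the text implies, that the agreement figure counts only items placed in the same category by both judges; the variable names are illustrative and are not part of the original study.

```python
# Counts reported in the text for the clinical-pertinence screening.
total_items = 619
both_suitable = 476
both_questionable = 32
# One judge rated "suitable" and the other "questionable" (the two orderings).
split_suitable_questionable = 43 + 56

# Assumption: interjudge agreement is the share of items that both
# judges placed in the same category.
agreement = (both_suitable + both_questionable) / total_items

# Items retained: both "suitable", or one "suitable" and one "questionable".
retained = both_suitable + split_suitable_questionable

print(f"interjudge agreement: {agreement:.1%}")
print(f"items retained: {retained}")
```

Under that assumption the agreement comes to roughly 82%, and the retained count matches the 575 items reported.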


Screening Items for Ratability 


Next, the 575 items were submitted for judgments
of “ratability” by each remaining member
of the clinic staff. Accordingly, six caseworkers,
four psychologists, and three group workers were
instructed to judge each item in terms of its
ratability based upon the specific source of data
available to them (i.e., interview, psychometric,
and observation, respectively) following a typical
diagnostic study. The three categories of “ratable,”
“questionable,” and “unratable” were designated
for this purpose. To facilitate an analysis
of judgments, the categories were assigned differential
weights and the item values were
summed across judges within each discipline. The
ratable category was given a weight of 2, the
questionable category a weight of 1, and the
unratable category 0 weight.

The items were then analyzed for inter- and 
intradiscipline ratability with the dual purpose of 
sampling items judged ratable by every clinician 
and items presumably judged ratable on the basis 
of the specific source of data available to each 
discipline. Intradiscipline ratability required that
an item receive at least unit weight (a rating of
questionable or better) from each clinician
and a weighted value greater than 50% of the
possible total value for that discipline. For example,
if each psychologist (N = 4) judged an
item questionable, or, if three psychologists 
judged an item questionable and the fourth judged 
it unratable, the weighted value for that item 
would be either 4 or 3, respectively. The per- 
centage value, however, would in neither case 
exceed 50% of the possible value of 8. For any 
item to be accepted, ergo, there must have been at 
least questionable agreement among judges and 
the item must have received a weighted value of 
5 or more. 
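
The intradiscipline rule can be written out as a short function. This is a minimal sketch under the reading given in the text (every judge at least “questionable,” and a summed weight above half the possible total); the function name and the rating labels used as dictionary keys are illustrative.

```python
# Weights assigned to the three rating categories, as stated in the text.
WEIGHTS = {"ratable": 2, "questionable": 1, "unratable": 0}


def intradiscipline_ratable(ratings):
    """Return True if an item passes the within-discipline ratability rule.

    ratings: one rating label per judge in a single discipline.
    The item must receive at least unit weight from every judge, and its
    summed weight must exceed 50% of the possible total (2 per judge).
    """
    values = [WEIGHTS[r] for r in ratings]
    possible_total = 2 * len(values)
    every_judge_at_least_questionable = all(v >= 1 for v in values)
    return every_judge_at_least_questionable and sum(values) > 0.5 * possible_total


# The two cases worked in the text for four psychologists: both fail.
print(intradiscipline_ratable(["questionable"] * 4))
print(intradiscipline_ratable(["questionable"] * 3 + ["unratable"]))
# A weighted value of 5 with no "unratable" rating passes.
print(intradiscipline_ratable(["ratable"] + ["questionable"] * 3))
```

With thirteen judges in all (six caseworkers, four psychologists, three group workers), the same weights give the maximum staff-wide value of 26 mentioned in the next paragraph.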

An analysis of judgments revealed that 213 
items had received a maximum value of 26, indi- 
cating complete staff agreement as to their rata- 
bility. Ten items were judged either questionable 
or unratable by every clinician and received a 
weighted value of 13 or less. An example of an 
unratable item was: “Has a wish to be passively 
manipulated genitally.” 

Furthermore, 65 items showed weighted values 
indicating intra- but not interdiscipline ratability, 
and were presumably judged ratable on the basis 
of data peculiar to each discipline. Of this item 
total, 30 were ratable by caseworkers, 20 by psy- 
chologists, and 15 by group workers. The follow- 
ing are examples of these items for each discipline. 

Casework: “Gets a good deal of satisfaction 
out of being able to control what goes on in 
the home.” 

Psychology: “Has a wish to kill people who 
thwart self in any way.” 

Groupwork: “Thrives on social (peer) group 
competition.” 

The remaining 331 items varied in value from
13 to 25. Because such a large number of these
items were apparently of acceptable ratability
(i.e., values between 20 and 25) it was arbitrarily
decided to adhere to a stringent criterion for
selection in order to reduce the size of the pool.
Thus, only those items were retained for further
screening which either received a maximum ratability
value of 26, or appeared ratable on the basis of
data peculiar to each discipline. To provide equal
representation for the latter, 15 items were
sampled from each discipline. This procedure
reduced the pool to 258 items (213 plus 45). These
items were sorted in the final screening process
described below.

Before concluding this section it is important to
note that no step was taken to insure the ratability
of the items by the MMPI judges. This procedure
was purposely omitted in order to develop
a unified pool appropriate for child guidance populations
and usable by child guidance personnel.
When MMPI judges were subsequently asked to
judge items for ratability from the MMPI alone,
approximately 10% of the items were uniformly
judged unratable. When consideration is given to
the 15 unratable items peculiar to each discipline,
any bias is canceled out. Thus it can probably be
concluded that the items were not prejudiced in
favor of any data or discipline.


Screening Items for Interpatient Variability 


The final step in the item selection procedure 
was designed to increase the discriminating power 
of the item pool. This phase involved an attempt 
to filter the pool of “Barnum algae" by eliminat- 
ing all items that could be made to fit mothers 
and children largely or wholly by virtue of their 
triviality (Paterson, 1951). This issue has been 
discussed in greater detail by Meehl (1954), and 
is considered by others (Cronbach, 1953; Duker, 
1958; Goodling & Guthrie, 1956; Halbower, 1955) 
to be an essential step in item selection. Ergo, to 
incorporate such discrimination insurance into the 
design, nine clinic therapists were requested to 
compile Q-sort descriptions of mothers and chil- 
dren then undergoing treatment. All therapists 
were experienced staff clinicians. 

To insure a heterogeneous patient sample, five
child therapists and four caseworkers were asked
to submit a list of patients seen in treatment for
a minimum of 10 hours. From this list the investigator
chose at random four mothers, one child
whose primary complaint was academic deficiency,
and four children from each of the following
diagnostic categories: psychotic disorder;
psychoneurotic disorder; personality disorder; and
psychophysiologic, autonomic, and visceral disorder.
The 258 items were then sorted by therapists into
an 11-category, semiforced, rectangular distribution.
The therapists were instructed to sort a
minimum of 15 items into each category according
to the degree to which the items were characteristic
of the person being described. The nine Q
sorts which resulted from this procedure were
used for the express purpose of eliminating items
which showed little variability in category placement.

An item analysis was done as follows: the category
value for each item placement (i.e., 1, 2, 3
. . . 11) was tabulated for all nine subjects. A
“variance score” representing the number of contiguous
categories of descriptiveness was computed
for each of the 258 items. The process of
item elimination was carried out for mothers and
children jointly. All items with variance values
less than 6 (i.e., varied over less than half the
distribution) for mothers, children, or both, were
eliminated. The following are examples of discarded
items.


Mother: “Utilizes rigid, constrictive, suppressive,
emotional control.” “Has inferiority feelings.”

Child: “Passivity (latent or manifest).”
“Tends to be self-defensive; anticipates being
attacked and criticized.”

Mother and Child: “Utilizes repression as a
defense mechanism.” “Is concerned with own
adequacy as a person, either at a conscious or
unconscious level.”
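
The elimination rule can be sketched in code. The text does not give a formula for the “variance score,” so the sketch assumes it is the span of categories covered by an item's placements (highest minus lowest, plus one), computed separately for the mother sorts and the child sorts; the function names and sample placements are hypothetical.

```python
def variance_score(placements):
    """Span of categories (1 to 11) covered by an item's placements.

    Assumption: "number of contiguous categories" is read as the
    distance from the lowest to the highest category used, inclusive.
    """
    return max(placements) - min(placements) + 1


def keep_item(mother_placements, child_placements, minimum=6):
    """Keep an item only if it varies enough for mothers AND for children."""
    return (variance_score(mother_placements) >= minimum
            and variance_score(child_placements) >= minimum)


# Hypothetical placements for one item across four mothers and five children.
print(keep_item([2, 5, 8, 9], [1, 3, 6, 7, 10]))   # spans 8 and 10
print(keep_item([5, 6, 6, 7], [1, 4, 8, 9, 11]))   # mother span is only 3
```

An item that stays within a narrow band of categories for either group is dropped, which is the intended filter against statements that fit nearly everyone.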


Final Q Array 


The Q array in its final form consisted of 135 
items which were believed to have the following 
attributes: reasonable representation of the broad 
domain of personality, appropriate for adults and 
children of both sexes, composed of pertinent 
clinical information, ratable by each member of 
the clinic staff, and adequate interpatient variability,
i.e., relatively free of “Barnum effect.”

Twenty-five percent of the pool contained non- 
pathological statements. Of these, 9 items were 
neutral and 25 items were positive with regard 
to normal adjustment. Approximately 25% of the 
pool was judged to be comprised of genotypic 
item content. The 135 items retained in the final 
pool are presented in Appendix B. 


Mother and Child Stereotype Descriptions 


There is mounting evidence to the effect that,
when provided minimal data regarding class membership
(e.g., sex, age, education, etc., or that a
person has patient status) judges can make personality
predictions that exceed chance expectancy
(Caldwell, 1958; Duker, 1958; Gage, 1953; Halbower,
1955; Hathaway, 1956; Kostlan, 1954;
Sines, 1957). This issue has been discussed by
Gage and Cronbach (1955) who conceptualize
(predictive) accuracy scores as divided into two
components: “stereotype accuracy” and “differential
accuracy.”

    Where the former refers to a judge’s ability to
    predict the pooled responses of a given category
    of persons, the latter refers to ability to
    differentiate among persons within the category.
    . . . When accuracy is scored directly, no
    distinction can be made between the two
    components which contribute to the judge’s success
    (p. 417).

To obviate such a state of affairs, the writers
advocate obtaining at least two scores: (a) ability
to predict the typical behavior in the next-larger
class to which the person belongs (e.g., an “average”
patient), and (b) ability to predict how a
person deviates from the norm of this class. Testimony
that accuracy scores warrant such an
analysis is provided by a series of recent PhD
dissertations at the University of Minnesota
(Duker, 1958; Halbower, 1955; Sines, 1957). In
each of these studies, the predictive efficiency of
judges’ patient descriptions was compared with the
accuracy of their patient stereotypes, and in many
instances the stereotype was found to be the more
accurate predictor.

To control for stereotype accuracy in the present
study, each clinician who derived diagnostic Q
sorts was asked to describe (by Q sort) his conception
of the “average child,” and the “average
mother” referred to a child guidance clinic for
evaluation and treatment. To obtain an estimate
of stereotype reliability, each clinic psychologist
compiled a resort approximately 9 months after
his first stereotype description. Every clinician
completed his stereotype descriptions before assignment
of study subjects. Stereotype sorts, as
well as all subject sorts, were forced into a
quasi-normal, 11-category distribution. The latter was
implemented to facilitate the numerous computations
required.

The stereotype mother and child descriptions
provided optimal chance estimates against which
“differential accuracy” scores (i.e., subject predictions)
could be compared. Two such comparisons
were made for each subject in the study.
First, each clinician’s mother or child description
was compared with his own stereotype for each
mother and/or child described by him. Second,
each clinician’s mother or child description was
compared with one of four composite clinic stereotypes
(i.e., “mean average subject”). The latter
were derived by fractionally omitting one stereotype
and forming a composite of the remaining
three. It was thus possible to compare the relative
accuracy of MMPI-based descriptions and clinic-based
descriptions, with a clinic-based composite
stereotype which was free of each clinician’s base
rate bias.
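
The composite stereotype procedure can be illustrated with a small sketch. It assumes that “fractionally omitting one stereotype” means leaving out one clinician's stereotype and averaging the item placements of the other three, and that descriptions are compared by an ordinary product-moment correlation across items; the function names and the five-item toy sorts are hypothetical (the actual array had 135 items).

```python
from math import sqrt


def q_correlation(x, y):
    """Pearson correlation between two Q sorts (item-by-item category values)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def leave_one_out_composite(stereotypes, omit):
    """Mean item placement across all stereotypes except the one named `omit`."""
    others = [sort for name, sort in stereotypes.items() if name != omit]
    n_items = len(others[0])
    return [sum(sort[i] for sort in others) / len(others) for i in range(n_items)]


# Toy stereotype sorts from four clinicians (hypothetical values).
stereotypes = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 2, 3, 4, 4],
    "C": [1, 3, 3, 3, 5],
    "D": [5, 4, 3, 2, 1],
}

# Composite that excludes Clinician D's own stereotype.
composite_without_d = leave_one_out_composite(stereotypes, omit="D")

# Compare one of D's (hypothetical) subject descriptions with that composite.
subject_description = [1, 2, 4, 3, 5]
print(round(q_correlation(subject_description, composite_without_d), 2))
```

Because the clinician's own stereotype never enters the composite he is compared against, agreement with the composite cannot be inflated by his personal base-rate habits.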


RESULTS 
Reliability of Clinical Evaluators 


Therapists’ Child Criterion Descriptions.
Intrasort reliability estimates (coefficients
of stability) were computed from the resort
descriptions of five clinic therapists who
had compiled four or more criterion descriptions
during the study. The number
of days between sort and resort ranged
from 2 to 14, and averaged 10. The coefficients
were .93, .86, .85, .85, and .82. The
five coefficients were transformed into z
scores, averaged, and converted to a mean
coefficient of .87. All coefficients were
significantly greater than zero beyond the
.001 level.
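
The averaging step described here is Fisher's z transformation: each coefficient is converted to z = arctanh(r), the z values are averaged, and the mean is converted back with tanh. A minimal sketch follows; the function name is illustrative, and the original computations were presumably done from tables rather than in this way.

```python
from math import atanh, tanh


def mean_correlation(coefficients):
    """Average correlations through Fisher's z transformation."""
    z_values = [atanh(r) for r in coefficients]
    return tanh(sum(z_values) / len(z_values))


# Sort-resort coefficients of the five therapists, as reported in the text.
therapist_resorts = [0.93, 0.86, 0.85, 0.85, 0.82]
print(round(mean_correlation(therapist_resorts), 2))  # 0.87
```

Averaging in the z metric gives slightly more weight to the high coefficients than a simple arithmetic mean would, which is why the procedure is preferred for correlations near the upper end of the scale.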

Psychologists’ Child Diagnostic Descriptions.
Reliability estimates for child diagnostic
descriptions were obtained from each
clinic psychologist. Two clinical psychologist
trainees repeated sorts of their final
case 2 and 3 days following their first Q-sort
descriptions. These coefficients were
.80 and .79 for the 2 and 3 day sorts, respectively.
Sort-resort correlations were
also computed for each of five children assigned
to staff psychologists for treatment.
In each case, the psychologist who served
as diagnostician (and who compiled diagnostic
descriptions), also served as therapist
and compiled the criterion description
for the same child. For each child, the second
description was separated from the first
by approximately 3 months. The correlations
obtained were .73, .62, .62, .56, and
.55. The mean coefficient calculated
through z transformation was .62. Again,
all correlations were reliably greater than
zero at well beyond the .001 level. Whereas
the latter comparisons were made primarily
to reflect the degree to which psychologists
changed their impressions of
children as a function of therapeutic contact
with them, the coefficients can also be
interpreted as estimates of the absolute
minimum stability that might be expected of
the Q array over an extended period of
time.

Child Stereotype Descriptions. Four
clinic psychologists and seven nonclinic
MMPI judges derived child stereotype descriptions.
Although only a fraction of the
55 possible intercorrelations were computed,
it appeared that they would range from .38
to .62, and average about .52. The latter
estimates were calculated by intercorrelating
the four stereotype descriptions derived
by clinic psychologists and then correlating
each of the seven judges’ stereotypes with
the one clinic stereotype that most agreed
with the other three. Accordingly, the child
stereotype derived by Psychologist C was
correlated with the seven judges’ stereotypes.
The coefficients obtained were .62,
.56, .52, .51, .49, .46, and .38. Table 18 reports
the intercorrelations for the four
clinic psychologists.
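
The count of possible intercorrelations, and the fraction actually computed, follow from simple combinatorics. The sketch below is only a check on those counts; the helper names are illustrative.

```python
from math import comb


def possible_pairs(n_sorters):
    """Number of distinct pairs among n sorters: n choose 2."""
    return comb(n_sorters, 2)


def computed_pairs(n_clinic, n_judges):
    """Pairs actually computed under the shortcut described in the text:
    all pairs among the clinic sorters, plus one correlation per outside
    judge with the single most representative clinic stereotype."""
    return comb(n_clinic, 2) + n_judges


# Child stereotypes: four clinic psychologists and seven MMPI judges.
print(possible_pairs(4 + 7))   # 55
print(computed_pairs(4, 7))    # 13
```

The shortcut thus samples 13 of the 55 possible coefficients, which is why the range and average are reported as estimates.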

In addition to the reliability estimates reported
above, intrasort correlations were
computed for child stereotype descriptions
which were resorted by the clinic psychologists.
The time interval between sort and
resort was approximately 9 months. The
correlations obtained were .83, .70, .65, and
.60. The mean coefficient computed through
z transformation was .70. All of the above
correlations were reliably greater than zero
well beyond the .001 level.


TABLE 18

INTERSORTER RELIABILITY OF PSYCHOLOGISTS’
CHILD STEREOTYPE DESCRIPTIONS

Psychologist      A       B       Cᵃ

B                .53
Cᵃ               .53     .54
Dᵃ               .44     .39     .52


ᵃ Third-year clinical psychologist trainee.


TABLE 19

INTERSORTER RELIABILITY OF CASEWORKERS’
MOTHER STEREOTYPE DESCRIPTIONS

Caseworker        W       X       Y

X                .43
Y                .61     .62
Z                .60     .43     .49



Caseworkers’ Mother Descriptions. Because
of the excessive demand made upon
caseworkers’ time, no specific step was taken 
to secure repeat sorts of their diagnostic 
or criterion descriptions. Although esti- 
mates of (maximum) reliability were not 
available for either of these descriptions, a 
measure of minimum stability can be in- 
ferred from the correlations between them. 
The first and second mother descriptions 
were in most cases separated in time by 
approximately 3 months. The 42 sort- 
resort correlations ranged from .30 to .84. 
The mean coefficient computed through z 
transformation was .65. All of these corre- 
lations were significantly greater than zero 
beyond the .001 level. 

It is of interest to note that the average 
coefficient obtained by correlating the four 
caseworkers’ diagnostic and criterion 
mother descriptions (.65) approximates 
that similarly obtained by correlating the 
two psychologists’ diagnostic and criterion 
child descriptions (.62). In both instances, 
the sort-resorts were separated in time by 
approximately 3 months. 


Mother Stereotype Descriptions. Reliability
estimates of mother stereotypes were
calculated in a manner similar to that of the
child stereotypes reported above. Four
clinic caseworkers and five nonclinic MMPI
judges derived “average mother” descriptions.
Although only a fraction of the 36
possible intercorrelations were computed, it
seemed that they would range from .25 to
.62 and average about .52. The latter estimates
(i.q. the child stereotypes) were computed
by intercorrelating the four stereotypes
compiled by the clinic caseworkers
and then correlating each of the five MMPI
judges’ stereotypes with the one clinic stereotype
that most agreed with the other three.
Accordingly, the mother stereotype compiled
by Caseworker Y was correlated with
each of the five judges’ descriptions. The
coefficients obtained were .58, .58, .52, .46,
and .25. Table 19 reports the intersorter
correlations of four clinic caseworkers. All
coefficients were reliably greater than zero
beyond the .001 level.
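
The summary figures for the two sets of stereotype correlations can be checked by pooling the coefficients that were actually computed. The sketch below uses the values as they are read from Tables 18 and 19 and from the text; the list names are illustrative, and the table entries should be treated as a best reading of the printed tables.

```python
from statistics import median

# Child stereotypes: six psychologist intercorrelations (Table 18)
# plus the seven judges' correlations with Psychologist C.
child = [0.53, 0.53, 0.54, 0.44, 0.39, 0.52,
         0.62, 0.56, 0.52, 0.51, 0.49, 0.46, 0.38]

# Mother stereotypes: six caseworker intercorrelations (Table 19)
# plus the five judges' correlations with Caseworker Y.
mother = [0.43, 0.61, 0.62, 0.60, 0.43, 0.49,
          0.58, 0.58, 0.52, 0.46, 0.25]

for label, values in (("child", child), ("mother", mother)):
    print(label, min(values), max(values), median(values))
```

On these values both sets have a median of .52, the child coefficients run from .38 to .62, and the mother coefficients run from .25 to .62 (or from .43 if the single low judge is set aside).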

A comparison of the correlations obtained 
from the child stereotype descriptions with 
those obtained from the mother stereotypes, 


TABLE 20 


INTRASORTER RELIABILITY OF Jupces’ MMPI- 
BASED MOTHER AND CHILD DESCRIPTIONS 


Days 
Judges between | Mother Child 
Sorts 
1» 14 US .75 
п» 21 .81 15 
IV 14 .85 «19 
VE 7 +75 .88 
уе 14 -63 .45 
АП sorts 277 .75 


* Excluding Judges III and VII who derived Q-sort descrip- 
tions of adolescents only. The results of these descriptions can 
be found elsewhere (Marks, 1959b). 

b MMPI expert. 

* MMPI neophyte. 
showed that the median coefficients of both groups were the same (.52).
Moreover, with the exception of the one judge's stereotype which
correlated only .25 with the modal description, the range of the two
distributions was similar (child .38 to .62; mother .43 to .62).


Judges' MMPI-Based Descriptions. Reliability estimates of mother and
child MMPI-based descriptions were obtained from each judge. Five judges
compiled resorts of mothers from MMPI profiles of mothers and of
children from MMPI profiles of parents in a manner identical to that of
their original mother and child descriptions. The intrasorter
reliability estimates for these descriptions are reported in Table 20.

It can be seen from Table 20 that the correlations
for mothers ranged from .63 to .85
and averaged .77. For children, the
correlations ranged from .45 to .88 and 
averaged .76. All coefficients were reliably 
greater than zero well beyond the .001 level. 


Results of Testing the Hypotheses 

Hypothesis I 

Findings related to the first two hypoth- 
eses are presented in Tables 21 through 
26. Tables 21 and 24 present means and 
standard deviations of MMPI scales for 
the two clinic samples and for two Min- 
nesota general population normative groups 
selected for comparison. Since both groups 
differed reliably in the variance of several 
scales, the difference between means was 
computed by the approximation method 
given by Cochran and Cox (1957). 
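
The comparison described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the computation actually used in the study: the function names are hypothetical, the input values are the K-scale raw-score means and standard deviations as they appear in Table 21, and the two tabled critical t values (for 47 and 314 degrees of freedom) are approximate figures supplied by hand. Because the inputs are rounded, the result need not agree exactly with the published t.

```python
import math

def variance_ratio(sd_a, sd_b):
    """F ratio with the larger variance placed in the numerator."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)

def unequal_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """t' statistic whose error term does not pool the two variances."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se

def weighted_critical_t(t_crit1, t_crit2, sd1, n1, sd2, n2):
    """Critical value formed by weighting each tabled t by s^2 / n."""
    w1, w2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    return (w1 * t_crit1 + w2 * t_crit2) / (w1 + w2)

# K scale: clinic mothers (N = 48) vs. general population females (N = 315).
f_ratio = variance_ratio(4.58, 5.07)
t_prime = unequal_variance_t(15.5, 4.58, 48, 12.1, 5.07, 315)
# Approximate two-tailed .05 critical values for 47 and 314 df (assumed).
t_needed = weighted_critical_t(2.01, 1.97, 4.58, 48, 5.07, 315)

print(round(f_ratio, 2), round(t_prime, 2), round(t_needed, 2))
```

The observed t' is declared reliable when it exceeds the weighted critical value; with equal variances the procedure reduces to the ordinary comparison against a single tabled t.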

Hypothesis I posited personality variable 
differences between mothers of clinic pa- 
tients and general population female adults. 
The results of testing this hypothesis appear 
in Tables 21, 22, and 23. Table 21 reports 
T score means and raw score means and 
standard deviations of MMPI clinical and 
validity scales for the clinic sample of 48 
mothers and for 315 female adults of the 
general Minnesota population. Examination 
of Table 21 shows that L and Pd scales 


TABLE 21

MMPI VALIDITY AND CLINICAL SCALE SCORES OF MOTHERS OF CHILD GUIDANCE
CLINIC PATIENTS AND OF GENERAL POPULATION FEMALE ADULTS

           Mothers      Mothers           Females
           (N = 48)     (N = 48)          (N = 315)
Scale      T Scores     Raw Scores        Raw Scores
           Mean         Mean     SD       Mean     SD        t          F

L          50.5          4.4    1.99       4.3    2.63       .43       1.74**
F          50.9          3.8    2.81       3.5    3.13       .64       1.24
K          56.7         15.5    4.58      12.1    5.07      4.64***    1.23
Hs^a       54.5         15.3    5.13      13.1    4.88      2.74**     1.10
D          58.2         23.5    5.04      19.3    5.18      5.33**     1.05
Hy         60.7         24.9    6.13      18.8    5.66      6.34**     1.11
Pd^a       61.1         23.3    5.32      18.4    4.40      5.95***    1.46*
Mf         49.7         36.3    4.82      36.5    4.83       .21       1.00
Pa         55.7          9.8    2.89       7.9    3.32      4.15***    1.33
Pt^a       54.8         28.1    5.47      25.2    6.06      3.34       1.23
Sc^a       54.3         25.5    5.64      22.6    6.50      3.12**     1.33
Ma^a       47.7         15.2    4.82      16.1    4.11      1.24       1.38
Si         54.9         29.7    8.75      25.0    9.58      3.35**     1.19

^a K-corrected scales.
* Significant between .10 and .05 levels.
** Significant at .05 level.
*** Significant at .01 level.

yield F ratios significant at the .10 level or less when a two-tailed
test is applied. Moreover, clinic mothers showed reliably higher mean
scores on 9 of the 13 scales tested. All clinical scales with the
exception of Mf and Ma were found reliably higher than the norm at the
.05 level or beyond. The largest differences occurred on Hy and Pd.
And, as might be expected, the Hy and Pd differences were also
prominent in the frequency of clinical scale high points, the frequency
of T scores equal to or greater than 70, and the mean MMPI code of
clinic mothers' profiles.

Table 22 reports the percentage frequency of clinical scale high points
as observed among the total number of females in both groups. The MMPI
normative data used for this comparison were taken from Hathaway and
Meehl (Welsh & Dahlstrom, 1956, p. 141). Since Mf and Si scale data were
not reported by these authors, the mothers' high point codes were
adjusted (the second high point was used) to facilitate a comparison
with the norm. Table 22 shows reliable differences between the clinic
and normative groups on two of eight clinical scales. Clinic mothers
show a significantly higher percentage frequency of Hy and Pd high
points, and a reliably lower

TABLE 22

FREQUENCY OF MMPI CLINICAL SCALE HIGH POINTS OF MOTHERS OF CHILD
GUIDANCE CLINIC PATIENTS AND OF GENERAL POPULATION FEMALE ADULTS

                   Mothers       Females
Scale              (N = 48)      (N = 360)      CR
                      %             %
Hs                   4.2           8.6          -^a
D                   16.7          12.5          .81
Hy                  27.1           7.5         4.30**
Pd                  20.8           8.3         2.74*
Pa                  12.5           2.2          -^a
Pt                   2.1           6.4          -^a
Sc                   2.1           6.1          -^a
Ma                   8.3          13.0          -^a
No high point        6.2          25.8         3.01*

^a Critical ratios were not computed for these scales as qN was less
than 5 (McNemar, 1949, p. 77).
* Significant at .006 level.
** Significant at .002 level.



frequency of no high point than is found in the general population.
Moreover, Hy and Pd appeared as high points in approximately 50% of
mothers' profiles as compared with a frequency of 16% in profiles of
female adults. The high point which appeared most frequently among
mothers was Hy (27%), whereas the most frequent high point in the
normative group is Ma (13%).

Table 23 reports percentages of mothers and normals whose T scores
equal or exceed 70 on each of eight clinical scales. Again, normative
frequencies were not available for Mf and Si. Table 23 shows T scores
equal to or greater than 70 (abnormal scores) occur with about equal
frequency (1 to 3%) in the general population, and that approximately
14% of females in the normative group obtain a score of 70 or greater
on at least one clinical scale. Unfortunately, the population
percentages were too small to permit a statistical test of individual
scale differences. The total percentage difference was computed,
however, and indicates that a reliably greater number of mothers obtain
abnormal scores than might be expected of a representative sample of
the general population. The percentages of abnormal profile scores were
50 and 14 for mothers and normals, respectively. Interestingly enough,
the two scales which show the highest frequency of abnormal scores in
the sample (Pd and Hy) have the lowest frequency as abnormal scores in
the normative group.

TABLE 23

FREQUENCY WITH WHICH MMPI CLINICAL SCALE T SCORES EQUAL OR EXCEED 70
AMONG PROFILES OF MOTHERS OF CHILD GUIDANCE CLINIC PATIENTS AND
GENERAL POPULATION FEMALE ADULTS

                   Mothers       Females
Scale              (N = 48)      (N = 360)      CR^a
                      %             %
Hs                    0            2.8           -
D                    4.2           2.8           -
Hy                  14.6           1.1           -
Pd                  14.6            .8           -
Pa                   4.2           1.4           -
Pt                   2.1           1.9           -
Sc                    0            1.4           -
Ma                   2.1           1.7           -
Total %^b           41.8          13.9          4.82*
Total %^c           50.0          13.9          6.10*

^a Critical ratios were not computed for these scales as qN was less
than 5 (McNemar, 1949, p. 77).
^b Excluding Mf and Si scales.
^c Including Mf and Si scales (in the clinic sample only).
* Significant at .002 level.

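
The critical ratios reported for the percentage comparisons can be approximated with a short sketch. It assumes, as an illustration only, that the critical ratio is the difference between two independent percentages divided by a standard error built from the pooled proportion; the source does not spell out the formula, and the function name is hypothetical. The rule for omitting scales with small expected frequencies is not implemented here. Inputs are the percentages and sample sizes printed in Tables 22 and 23.

```python
import math

def critical_ratio(pct1, n1, pct2, n2):
    """Difference between two independent percentages divided by its
    standard error, with the error term based on the pooled proportion."""
    p1, p2 = pct1 / 100.0, pct2 / 100.0
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / se

# (label, clinic mothers %, normative females %) with N = 48 and N = 360.
comparisons = [
    ("Hy high point", 27.1, 7.5),
    ("Pd high point", 20.8, 8.3),
    ("D high point", 16.7, 12.5),
    ("Total % T >= 70, excluding Mf and Si", 41.8, 13.9),
]
for label, mothers_pct, females_pct in comparisons:
    print(label, round(critical_ratio(mothers_pct, 48, females_pct, 360), 2))
```

Under this assumed formula the values come out within rounding error of the tabled critical ratios, which suggests, without proving, that a pooled error term of this kind was used.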

It is also noteworthy that Hy and Pd 
were found to occupy the first and second 
positions among mothers' and fathers' pro- 
files of a sample of parents referred to the 
Washburn Memorial Clinic in Minneapolis 
(Hanvik & Byrum, 1959). The mean rank profile codes⁷ of both samples
were 3426071859 and 3462780159 for the study clinic and Washburn
samples, respectively. A rank order correlation between these codes
yielded a coefficient of .93, indicating extreme similarity of mothers
in the two clinic populations.
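
The rank order correlation between two profile codes can be illustrated with a brief sketch. It assumes that each code lists the ten scale digits from highest to lowest mean rank, that the study clinic code reads 3426071859 once the line-break hyphen is removed, and that the coefficient is Spearman's rho without ties; the function name is hypothetical.

```python
def rank_order_correlation(code_a, code_b):
    """Spearman rho between two codes that list the same scales in rank order."""
    if sorted(code_a) != sorted(code_b) or len(set(code_a)) != len(code_a):
        raise ValueError("codes must list the same scale digits, each once")
    n = len(code_a)
    # Position in the code (1 = most elevated) serves as the rank of each scale.
    rank_a = {scale: pos for pos, scale in enumerate(code_a, start=1)}
    rank_b = {scale: pos for pos, scale in enumerate(code_b, start=1)}
    d_squared = sum((rank_a[s] - rank_b[s]) ** 2 for s in code_a)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Mothers: study clinic code vs. Washburn code.
rho_mothers = rank_order_correlation("3426071859", "3462780159")
print(round(rho_mothers, 2))  # 0.93
```

With these two codes the sum of squared rank differences is 12, which gives 1 - 72/990, or .93 after rounding, in agreement with the coefficient reported in the text.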

An examination of clinical scale high 
points revealed five code patterns, which 
combined represented 70% of all profiles in 
the clinic mother sample. 

Codes 13 and 31: This pattern, commonly 
referred to as the “conversion V” or “hy- 
steroid valley,” was found among 29% of 
the mothers’ profiles (cf. 33% of mothers’ 
profiles in the Washburn sample). Some 
follow-up data collected after 6 to 9 months 
of treatment⁸ indicated that, regardless of
elevation, a majority of cases in which the 
mothers’ code was of this pattern responded 
favorably to treatment. 

Codes 24 and 42, and 34 and 43: As compared with the 34 and 43 pattern,
mothers in the 24-42 group were given slightly better ratings by
therapists following 6 months of treatment. Neither group, however,
presented a very favorable prognosis. Mothers with the 34-43 pattern
canceled many interviews and seldom remained in treatment beyond 3
months. Any elevation on Scale 0 (Si), especially if equal to or
greater than a T score of 60, was a malignant sign. Hanvik and Byrum
found a high Pd minus Ma differential frequently obtained by mothers
who were situationally maladjusted and reacting adversely and with
extreme hostility to a portion of the social environment, and who
presented problems of severe marital conflict. The average Pd minus Ma
score of their mother group was 9.7. This differential was also
computed for mothers of the present study and found to average 12.8. An
examination of the Pd minus Ma differential for the groups rated for
response to treatment revealed scores of 2.8, 15.6, and 20.4 for the
improved, partially improved, and unimproved groups, respectively. The
relative magnitude of the score for the partially improved group is of
especial interest when the criteria for the rating are known. This
group was comprised of cases in which the child's therapist had judged
the child improved, whereas the caseworker had judged the mother unable
to form a relationship. Thus, as might be expected, the Pd minus Ma
differential of the partially improved group more closely approximates
that of the unimproved than of the improved group.

Codes 20 and 02: The most frequent pattern associated with the
partially improved group was 20-02. Scale 0, especially if associated
with Scale 2 in second position, occurred among mothers' profiles of
half the cases judged partially improved, whereas only 6% of mothers
who obtained this pattern received a different rating.

⁷ The coding system employed is that described by Hathaway (1947). The
clinical scales are identified by the numbers 1, 2, 3, 4, 5, 6, 7, 8,
9, and 0 in order from left to right in the usual profile arrangement
(1 Hs, 2 D, 3 Hy, etc.).

⁸ At the time criterion descriptions were compiled each therapist
independently rated each child and mother in terms of whether they were
able to form a "therapeutic relationship." At the close of the study
each caseworker rated each case in terms of response to treatment. The
latter ratings were made from 6 to 9 months after therapy had begun.
From an analysis of these ratings, three groups were identified as
improved, partially improved, and unimproved. The improved group was
comprised of 15 cases in which both mother and child were able to form
a relationship and subsequently were able to respond favorably to
treatment. The partially improved group was comprised of 18 cases in
which the child was able to form a relationship and to respond to
treatment, but the mother was not. The unimproved group included 15
cases in which neither the mother nor child was able to relate to a
therapist nor respond to treatment, or the case terminated treatment
against clinic advice. In several instances somewhat arbitrary
decisions had to be made regarding assignment to groups; however, each
decision was made independent of knowledge of the MMPI.

Code 9: Although relatively infrequent 
as a high point and seldom elevated in any 
profile, Code 9 was nevertheless the best 
single index of improvement. Every case 
in which the mother's profile coded 9 high 
was judged improved. 


Hypothesis II 


Hypothesis II posited personality variable differences between fathers
of clinic patients and general population male adults. Tables 24, 25,
and 26 report data testing this hypothesis. In Table 24, T score means
and raw score means and standard deviations of MMPI clinical and
validity scales are presented for 32 clinic fathers and for 226 adult
males of the general Minnesota population. Table 24 shows that the F,
D, Pa, and Sc scales yielded F ratios significant at the .05 level or
less. Moreover, for 8 of the 13 scales tested, scores of fathers were
significantly higher than the mean of the normative group at the .05
level or beyond. The only clinical scales which failed to yield
reliable mean differences were Pa, Ma, and Si. The largest differences
occurred for D, Hy, Pd, and Pt which were all reliable well beyond the
.01 level. No scale mean was observed significantly lower than the mean
of the normative group.

TABLE 24

MMPI VALIDITY AND CLINICAL SCALE SCORES OF FATHERS OF CHILD GUIDANCE
CLINIC PATIENTS AND OF GENERAL POPULATION MALE ADULTS

           Fathers      Fathers           Males
           (N = 32)     (N = 32)          (N = 226)
Scale      T Scores     Raw Scores        Raw Scores
           Mean         Mean     SD       Mean     SD        t          F

L          50.0          4.0    2.04       4.1    2.89       .05       2.01
F          51.2          4.4    1.84       3.9    4.24       .39       5.32**
K          55.0         16.3    4.66      13.4    5.66      3.11*      1.46
Hs^a       56.7         14.0    5.04      11.3    3.90      2.84*      1.67
D          62.0         21.7    6.35      16.6    4.18      4.37**     2.31**
Hy         59.5         21.7    5.50      16.5    5.51      5.04**     1.00
Pd^a       59.5         23.2    4.12      19.3    4.11      5.05**     1.00
Mf         54.6         22.8    4.83      20.4    5.13      2.58*      1.13
Pa         50.0          8.4    2.49       8.1    3.56       .69       2.03*
Pt^a       56.8         26.3    5.29      22.9    4.88      3.37**     1.17
Sc^a       54.8         24.8    7.41      22.3    5.21      2.07*      2.02*
Ma^a       51.4         17.6    4.11      17.0    3.87       .71       1.18
Si         51.8         26.8    9.68      25.0    9.58       .94       1.02

^a K-corrected scales.
* Significant at .05 level.
** Significant at .01 level.

In Table 25 the frequency of clinical scale high points as observed
among males in both groups is presented in percentage form. Since Mf
and Si data were not available for adult males, the clinic codes were
redistributed and the second high point used. An examination of Table
25 shows that clinic fathers, like clinic mothers, showed a
significantly higher frequency of Hy high point and reliably lower
frequency of no high point (zero frequency) than is found among adults
in the normative group. The most frequent high point among fathers was
D, which was significantly greater than the norm at the .002 level.
Whereas Hy and Pd appeared as high

TABLE 25

FREQUENCY OF MMPI CLINICAL SCALE HIGH POINTS OF FATHERS OF CHILD
GUIDANCE CLINIC PATIENTS AND OF GENERAL POPULATION MALE ADULTS

                   Fathers       Males
Scale              (N = 32)      (N = 258)      CR
                      %             %
Hs                  15.6          10.4          .89
D                   25.0           6.6         3.50**
Hy                  21.9           7.0         2.69*
Pd                  15.6          11.6          .66
Pa                   6.2           7.7          -^a
Pt                    0            6.7          -^a
Sc                   6.2           4.8          -^a
Ma                   9.4          17.4         1.15
No high point         0        [illegible]     3.11**

^a Critical ratios were not computed for these scales as qN was less
than 5 (McNemar, 1949, p. 77).
* Significant at .006 level.
** Significant at .002 level.


points in approximately 50% of mother pro- 
files, Hy and D appeared as high points in 
47% of father profiles. It is notable that 
Pt, which appears elevated with D for a 
large number of psychiatric males, was non- 
existent as a high point among clinic 
fathers. 

Table 26 reports the percentage frequency of T scores equal to or
greater than 70 among the clinic and normative male groups. An
examination of Table 26 shows that abnormal scores occur with
frequencies up to about 5% among general population males, and that 15%
of normals obtain T scores of 70 or greater on at least one clinical
scale. Although the frequencies of abnormal scores on individual scales
were too small for statistical comparisons, the total percentage
difference between groups was computed and found significant at the
.002 level. Thus, 44% of clinic fathers and 50% of clinic mothers
obtained abnormal scores on one or more scales as compared with 15% of
males and 14% of females of the normative groups. It is notable that
the most frequent high point of fathers (D) is among the most
infrequent high points of normals, whereas the most frequent high point
of male adults [...]quency among the c[...]



A comparison of the above data for 
fathers with findings reported by Hanvik 
and Byrum revealed little similarity. The 
mean rank profile codes were 2435718906 
and 5342691870 for the study clinic and 
Washburn samples, respectively. A rank 
order correlation between codes yielded a 
coefficient of .59, which fell short of sig- 
nificance. 

An examination of clinical scale high 
points yielded four code patterns, which 
combined represented 85% of all profiles in 
the clinic father sample. 

Codes 20 and 02: Thirty-seven percent of 
clinic fathers obtained profiles in which 
Scales 2 and 0 were found in prominent 
positions. This pattern was also found fre- 
quent among profiles of clinic mothers (see 
discussion supra) and in general was also 
characteristic of the average profile of the 
total father sample. Profiles with 20-02 
T values of 70 and above occurred with 
a frequency of about 70%. The 20-02 
pattern, more often than not, indicated an 
unfavorable prognosis. However, since ele- 
vation itself was negatively related to im- 
provement, the latter to some extent might 
be expected. It should be noted, moreover, 


TABLE 26

FREQUENCY WITH WHICH MMPI CLINICAL SCALE T SCORES EQUAL OR EXCEED 70
AMONG PROFILES OF FATHERS OF CHILD GUIDANCE CLINIC PATIENTS AND
GENERAL POPULATION MALE ADULTS

                   Fathers       Males
Scale              (N = 32)      (N = 258)      CR^a
                      %             %
Hs                   9.4           4.3           -
D                   12.5           1.9           -
Hy                   9.4        [illegible]      -
Pd                   6.2           1.9           -
Pa                    0         [illegible]      -
Pt                    0            1.5           -
Sc                   3.1        [illegible]      -
Ma                    0            4.3           -
Total %^b           40.6          15.1          4.75*
Total %^c           43.7          15.1          4.80*

^a Critical ratios were not computed for these scales as qN was less
than 5 (McNemar, 1949, p. 77).
^b Excluding Mf and Si scales.
^c Including Mf and Si scales (in the clinic sample only).
* Significant at .002 level.

that both 2 and 0 (D and Si) have been
found sensitive to elevations with age 
(Brozek, 1955; Calden & Hokanson, 1959; 
Sopchak, 1958). And, since clinic fathers 
were significantly older than general popu- 
lation males, some elevation would be ex- 
pected as a function of age alone. Contrary 
to expectation, however, an elevation of 
Scale 2 did not indicate "counseling readi- 
ness" for either parent in the clinic sample. 

Codes 13 and 31: This pattern was found 
to occur in 19% of clinic fathers' profiles 
(cf. 24% of fathers' profiles of the Wash- 
burn sample). The 13-31 pattern of fathers, 
unlike that of mothers, was associated with 
case unimprovement. 

Code 4: In 16% of fathers' profiles Pd 
was the most prominent scale. The Code 4 
pattern was characterized by secondary ele- 
vations on K and Scale 9 with Scales 5 and 
0 generally down. Fifty percent of profiles 
with this pattern showed one or more scales 
with T values of 70 or above. In contrast 
to the finding for mothers, Pd elevation for 
fathers was most often associated with case 
improvement. However, rarely did fathers 
with this pattern participate in clinic treat- 
ment. Despite the relatively high incidence 
of Pd among mothers and fathers, in only 
one case did Pd appear as the high point 
for both. It would be of interest to know 
the incidence of Pd as a high point among 
parents of delinquents! 

Code 95: Scales 9 and 5 appeared in first 
and second position, respectively, in 16% 
of clinic fathers' profiles. In no instance, 
however, was the 95 pattern associated with 
abnormal scores on any scale, nor was Pd 
prominent in any profile. Among these pro- 
files, Scales 1 and 2 were invariably down, 
and often the latter appeared below a T 
value of 50. More often than not, the 95 
code for fathers was associated with case 
improvement (q.v. Code 9 for mothers
supra).


Hypothesis III


Results which pertain to Hypotheses III, IV, and V were analyzed with
nonparametric statistical techniques. Although parametric statistics
are commonly used in Q research, the assumptions on which they are
based can seldom if ever be legitimately made. For example, the
product-moment correlational technique necessitates the assumption that
the variables (Q-sort statements) have been randomly sampled from a
universe of statements descriptive of the personality domain. However,
since the parameters of personality are largely unknown, and since most
statements are selected with some definite expediency in mind (e.g.,
ratability), a departure from the statistical model is unavoidable. The
most common circumvention of the problem has been to interpret the
coefficients “as if” the sampling assumption could be made. A more
defensible alternative, however, would be to interpret the coefficient
as an index of similarity only (i.e., as a means of ordering data) and
to process this index with statistics of the nonparametric model. In
the present study both the Sign and Binomial tests were used for this
purpose.

Hypothesis III posited that judges could 
derive personality descriptions of mothers 
from blind readings of their MMPI pro- 
files which would correlate more highly 
with caseworkers’ therapy-based (criterion) 
descriptions than with descriptions of the 
same mothers derived by the same case- 
workers after only 3 interview hours. Find- 
ings which pertain to this hypothesis are 
reported in Table 27. 

Table 27 presents the mean correlations with caseworkers' diagnostic
and criterion descriptions of judges' mother predictions. These
coefficients were obtained by correlating each judge's MMPI-based
description with similar descriptions compiled by caseworkers following
brief and more extended contact. The average coefficients were computed
through z transformation. The procedure for testing the hypothesis was
as follows: The correlations of judges' MMPI-based descriptions with
caseworkers' diagnostic and criterion descriptions were compared for
each judge; if the judge's description correlated more highly with the
criterion than with the diagnostic description, a plus was tallied; if
the judge's description correlated more highly with the diagnostic than
with the criterion description, a minus was tallied; if the judge's
description correlated the same with both the diagnostic and the
criterion description (i.e., there was no difference in the magnitude
of the coefficient), a tie was recorded; separate tallies were made for
each judge and then combined for different comparisons.

TABLE 27

MEAN CORRELATIONS OF JUDGES' MMPI-BASED MOTHER DESCRIPTIONS WITH
CASEWORKERS' DIAGNOSTIC DESCRIPTIONS AND CASEWORKERS' CRITERION
DESCRIPTIONS

               Q Sorts     Diagnostic     Criterion     Criterion r higher
Judges^a          N            r              r            N         p
I^b               8          .20*           .32***         7        .03
II^b              9          .21*           .41***         8        .02
IV                8          .21*           .31***         6        .14
V                 9          .28**          .38***         8        .02
VI                8          .24**          .34***         7        .03
All sorts        42          .23**          .36***        36        .001

^a Excluding Judges III and VII who derived Q-sort descriptions of
adolescents only. The results of these descriptions can be found
elsewhere (Marks, 1959b).
^b MMPI expert.
* Significant at .05 level.
** Significant at .01 level.
*** Significant at .001 level.
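
The averaging and tallying steps just described can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the weighting of each judge's mean by the number of Q sorts is an assumption (the source says only that averages were computed through z transformation), and the pairs of coefficients used to demonstrate the tally are invented for illustration and are not data from the study.

```python
import math

def mean_r_via_z(rs, weights=None):
    """Average correlations by converting each r to Fisher's z,
    taking the (weighted) mean of the z values, and converting back."""
    if weights is None:
        weights = [1.0] * len(rs)
    z_mean = sum(w * math.atanh(r) for r, w in zip(rs, weights)) / sum(weights)
    return math.tanh(z_mean)

def tally_signs(pairs):
    """Count plus, minus, and tie outcomes for (diagnostic r, criterion r) pairs."""
    plus = sum(1 for diag, crit in pairs if crit > diag)
    minus = sum(1 for diag, crit in pairs if crit < diag)
    ties = len(pairs) - plus - minus
    return plus, minus, ties

# Judges' mean diagnostic correlations and Q-sort counts as printed in Table 27.
diagnostic_means = [0.20, 0.21, 0.21, 0.28, 0.24]
sort_counts = [8, 9, 8, 9, 8]
overall = mean_r_via_z(diagnostic_means, sort_counts)

# Hypothetical (diagnostic r, criterion r) pairs for one judge's eight sorts.
example_pairs = [(0.15, 0.30), (0.22, 0.41), (0.31, 0.29), (0.10, 0.25),
                 (0.20, 0.20), (0.18, 0.37), (0.25, 0.33), (0.12, 0.28)]

print(round(overall, 2), tally_signs(example_pairs))
```

Averaging in the z metric rather than averaging the raw coefficients avoids the slight downward bias that arises because the sampling distribution of r is skewed.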

An examination of Table 27 shows that the total mean correlation of
judges' MMPI-based descriptions with caseworkers' diagnostic
descriptions was .23, and that the mean correlation of the former with
caseworkers' criterion descriptions was .36. The two coefficients are
reliably greater than zero at the .01 and .001 levels, respectively. In
Column 5 of Table 27 the number of plus signs (i.e., criterion r higher
than diagnostic r) is entered for each judge. Column 6 reports
one-tailed probabilities associated with the occurrence of values in
Column 5. It can be seen from Column 6 that for four out of five judges
the individual probabilities lead to a rejection of the null hypothesis
of no difference in the comparative accuracy of the two descriptions at
the .03 level or beyond. Furthermore, in 36 out of 42 total
comparisons, judges' descriptions agreed more closely with the
criterion. For such an extreme split, the Binomial test (corrected for
continuity) yields a very low associated probability under the null
hypothesis of no difference between the two descriptions. Since the
hypothesis predicted the direction of the difference, the region of
rejection is one-tailed and the probability value is .001.
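
The one-tailed probabilities discussed above can be approximated with a short sketch. It assumes that the individual judges' values come from the exact binomial distribution with p = .5 and that the combined comparison uses the normal approximation with a continuity correction; both function names are hypothetical, and the published values may have been read from tables with slightly different rounding.

```python
import math

def sign_test_exact(plus, n):
    """Exact one-tailed probability of `plus` or more plus signs
    among n comparisons when plus and minus are equally likely."""
    return sum(math.comb(n, k) for k in range(plus, n + 1)) / 2 ** n

def binomial_normal_approx(plus, n):
    """One-tailed normal approximation to the binomial (p = .5),
    corrected for continuity."""
    z = (plus - 0.5 - n / 2) / math.sqrt(n / 4)
    return 0.5 * math.erfc(z / math.sqrt(2))

# (judge, plus signs, comparisons) as entered in Table 27.
for judge, plus, n in [("I", 7, 8), ("II", 8, 9), ("IV", 6, 8),
                       ("V", 8, 9), ("VI", 7, 8)]:
    print(judge, round(sign_test_exact(plus, n), 3))

# Combined split of 36 plus signs in 42 comparisons.
print(binomial_normal_approx(36, 42) < 0.001)
```

For the combined split the approximate probability falls well below .001, which is consistent with the value reported in the text.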


Hypothesis IIIa. Hypotheses IIIa and IIIb posited that judges'
MMPI-based descriptions of mothers would exceed chance expectancy.
Since the hypotheses are related, the findings are presented jointly
and appear in Table 28. In Column 3 of Table 28 are entered the mean
correlations with the criteria of mother stereotype descriptions
compiled by five MMPI judges. The average correlation of these
descriptions was .21 which is reliably greater than zero at the .05
level. For comparative purposes, the mean correlations with the
criteria of judges' MMPI-based descriptions are presented in Column 9.
The average correlation of these descriptions was .36 which is reliably
greater than zero at the .001 level. In Column 4 are entered the number
of plus signs (i.e., MMPI-based r higher than stereotype r) for each
judge. Column 5 reports the one-tailed probabilities associated with
the occurrence of values in Column 4. From these data it can be seen
that for three of the five judges the individual probabilities lead to
a rejection of the null hypothesis of no difference between the
stereotype and MMPI-based descriptions at the .09 level or beyond.
Although the Sign test did not yield probabilities greater than chance
for two of the judges, the average correlation of their MMPI-based
descriptions was greater in magnitude than that of their stereotypes in
both instances. The important test of Hypothesis IIIa, however,
involved the combined performance of five judges. Table 28 shows that
in 31 out of 42 such comparisons, the accuracy of judges' MMPI-based
descriptions surpassed that of their own stereotypes. Since the
hypothesis was directional, the region of rejection is one-tailed and
the probability value is .002. On the basis of combined data,
Hypothesis IIIa found support.
TABLE 28

MEAN CORRELATIONS WITH THE CRITERION OF JUDGES' MMPI-BASED MOTHER
DESCRIPTIONS, JUDGES' STEREOTYPE MOTHER DESCRIPTIONS, AND CASEWORKERS'
COMPOSITE STEREOTYPE MOTHER DESCRIPTIONS

                       Judges'                     Composite
                       Mother                      Mother                      Judges'
            Q Sorts    Stereo.     r higher        Stereo.     r higher        MMPI
Judges^a       N          r         N      p          r         N      p          r
I^b            8        .09         7     .03       .35***      4     .64       .32***
II^b           9        .29***      6     .25       .33***      7     .09       .41***
IV             8        .14         7     .03       .18*        5     .36       .31***
V              9        .24**       7     .09       .23**       7     .09       .38***
VI             8        .27**       4     .64       .30***      5     .36       .34***
Totals        42        .21*       31     .002      .28***     28     .02       .36***

^a Excluding Judges III and VII who derived Q-sort descriptions of
adolescents only. The results of these descriptions can be found
elsewhere (Marks, 1959b).
^b MMPI expert.
* Significant at .05 level.
** Significant at .01 level.
*** Significant at .001 level.


Hypothesis IIIb. Columns 6, 7, and 8 of Table 28 report data which
pertain to Hypothesis IIIb. In Column 6 of Table 28 are entered the
mean correlations with the criteria of composite mother stereotype
descriptions compiled by three of four clinic caseworkers. The average
correlation of these descriptions was .28 which is reliably greater
than zero at the .001 level. In Column 7 are presented the number of
plus signs (i.e., MMPI-based r higher than composite stereotype r) for
each judge. Column 8 contains the one-tailed probabilities associated
with the occurrence of values in Column 7. An examination of these data
for individual judges shows that in at least three instances the
probability values do not lead to rejection of the null hypothesis of
no difference between the accuracy of the descriptions. Again, however,
the important test involved the total performance of five judges. It
can be seen from Table 28 that in 28 out of 42 such comparisons, the
accuracy of judges' MMPI-based descriptions surpassed that of
caseworkers' composite stereotypes. The one-tailed probability
associated with the occurrence of this split led to a rejection of the
null hypothesis of no difference at the .02 level. On the basis of the
combined performance of five judges, Hypothesis IIIb found support.


Hypothesis IIIc. Hypotheses IIIc and IIId posited that caseworkers'
diagnostic descriptions of mothers would exceed chance expectancy.
Since both hypotheses predicted greater agreement between caseworkers'
diagnostic and criterion descriptions than between their stereotype and
criterion descriptions, the findings for each are considered jointly.
In Column 3 of Table 29 are entered the mean correlations with the
criteria of mother stereotype descriptions derived by four clinic
caseworkers. These coefficients ranged from .27 to .49 and averaged
.41. All are reliably greater than zero at the .001 level. For
comparative purposes, the mean correlations with the criteria of
caseworkers' diagnostic descriptions (i.e., their average sort-resort
correlations) are presented in Column 9. These coefficients ranged from
.62 to .69 and averaged .65. An inspection of the correlations in
Columns 3 and 9 shows considerable variability among the former and
relative stability among the latter. The discrepancy in the magnitude
of caseworkers' own stereotype “loadings” was somewhat disconcerting.
Coefficients as high as .49 (Caseworkers W and Z) indicate, ceteris
paribus, that a sizable percentage of variance associated with the
reliability of two caseworkers' descriptions of mothers can also be
accounted for by reference to their prevailing conception of the
“average clinic mother.”

In Column 4 of Table 29 the number of plus signs (i.e., diagnostic r
higher than stereotype r) is reported for each caseworker. Column 5
presents one-tailed probabilities associated with values in Column 4.
These data show that for each of the four caseworkers the associated
probabilities lead to a rejection of the null hypothesis of no
difference in the accuracy of the stereotype and diagnostic
descriptions at the .06 level or beyond. Moreover, in 37 out of 42
total comparisons the accuracy of diagnostic descriptions surpassed
that of the stereotypes. The probability value associated with such an
extreme split is .001. These findings support Hypothesis IIIc.

Hypothesis IIId. In Columns 6, 7, and 8 of Table 29 are reported data
which pertain to Hypothesis IIId. The coefficients in Column 6 were
obtained by correlating the combined stereotype (composite) description
of three caseworkers with each criterion description of the fourth and
then averaging the correlations (through z transformation) to obtain a
mean coefficient. The four average composite stereotypes thus obtained
ranged from .18 to .34 with a mean of .28. Three of the four
coefficients were reliably greater than zero at the .001 level. One was
reliably greater than zero at the .05 level. Although the mean
composite stereotype coefficients in Column 6 are as variable as the
mean individual stereotype coefficients in Column 3, the magnitude of
the latter exceeds that of the former in every instance. This
discrepancy in the accuracy of composite and individual stereotype
“loadings” suggests that individual caseworkers ascribe a greater
percentage of "average mother" variance to criterion mother
descriptions than is warranted.

In Columns 7 and 8 of Table 29 the num- 
ber of plus signs (i.e., diagnostic r higher 
than composite stereotype r) and the proba- 
bilities associated with these values are en- 
tered for each caseworker. The data for in- 
dividual caseworkers lead to a rejection of 
the null hypothesis of no difference in the 
comparative accuracy of the composite 
stereotype and diagnostic descriptions at the 
.02 level or beyond. Moreover, the probability
associated with 41 positive of 42 total
values is .001. Thus, on the basis of both 
individual and combined results Hypothesis 
IIId was supported.


Hypothesis IV 


Hypothesis IV posited that judges’ de- 
scriptions of children derived from parent 
personality test data would better predict 


TABLE 29


MEAN CORRELATIONS WITH THE CRITERION OF CASEWORKERS' DIAGNOSTIC, STEREOTYPE,
AND COMPOSITE STEREOTYPE MOTHER DESCRIPTIONS


                      Mother    Caseworkers'    Composite   Caseworkers'   Caseworkers'
            Q Sorts   Stereo.     r higher       Mother       r higher       Resort
                                                 Stereo.
Caseworker     N         r        N      p          r         N      p          r

W             15       .49*      11    .06        .33**      14    .001       .63**
X             11       .37**     11    .001       .28**      11    .001       .70**
Y             10       .27**      9    .01        .18*       10    .001       .62**
Z              6       .49**      6    .02        .34**       6    .02        .65**
Totals        42       .41       37    .001       .28**      41    .001       .65**


* Significant at .05 level.
** Significant at .001 level.
criterion descriptions compiled by clinic
therapists than diagnostic descriptions com- 
piled by clinic psychologists. 

Table 30 presents data which pertain to 
Hypothesis IV. In Column 4 of Table 30 
are entered mean correlations with the cri- 
teria of descriptions of children extra- 
polated from parent MMPI profile data. 
Entries in Column 3 were obtained by cor- 
relating and averaging each judge's descriptions
with similar descriptions derived
by clinic psychologists. 

An inspection of Column 3 shows that 
the mean correlations of judges with psy- 
chologists ranged from .09 to .32 and aver- 
aged .22. Two of the five mean coefficients 
are reliably greater than zero at the .001 
level; two are reliable at the .05 level, and 
one is not significant. In Column 4 are en- 
tered the mean correlations (validity coeffi- 
cients) of judges with therapists. These 
mean coefficients ranged from .06 to .41 and 
averaged .28. Three are reliably greater 
than zero at the .001 level, one at the .01 
level, and one is not significant. An inspec- 
tion of the mean correlations for each judge 
reveals that, with exception of Judge I, all 
were in closer agreement with therapists 
than with psychologists. 


TABLE 30


MEAN CORRELATIONS OF JUDGES’ EXTRAPOLATED
CHILD DESCRIPTIONS WITH PSYCHOLOGISTS’
DIAGNOSTIC DESCRIPTIONS AND THERAPISTS’
CRITERION DESCRIPTIONS


                                              Criterion
             Q Sorts   Diagnostic  Criterion   r higher
Judgeᵃ          N          r           r        N     p

Iᵇ              7        .09         .06        3    .50
IIᵇ             8        .20*        .24**      5    .36
IV              8        .29***      .36***     7    .03
V               8        .32***      .41***     7    .03
VI              7        .19*        .29***     4    .50
All sorts      38ᶜ       .22*        .28***    26    .02


ᵃ Excluding Judges III and VII, who derived Q-sort descriptions
of adolescents only. The results of these descriptions can
be found elsewhere (Marks, 1959b).
ᵇ [illegible] expert.
ᶜ Excluding four cases due to inaccurate criterion.
* Significant at .05 level.
** Significant at .01 level.
*** Significant at .001 level.


In Column 5 of Table 30 the number of
plus signs (i.e., criterion r higher than diagnostic
r) is entered for each judge. Column
6 presents the one-tailed probabilities
associated with the occurrence of values in
Column 5. It can be seen from Column 6
that for two out of five judges the individual
probability values lead to a rejection of
the null hypothesis of no difference in the
comparative accuracy of the two descriptions
at the .03 level. For three judges the
individual probability values fail to reach an
acceptable level. The important test of the
hypothesis, however, concerned the total
performance of five judges. An inspection
of this total in Row 6 shows that in 26 out
of 38 comparisons, judges’ descriptions
agreed more closely with the criterion. For
such a split the Binomial test yields a relatively
low probability under the null hypothesis
of no difference. Since the hypothesis
predicted the direction of difference, the
region of rejection is one-tailed and the
probability value is .02. Hypothesis IV was
supported. 
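The one-tailed Binomial (Sign) test reasoning above can be sketched in a few lines of Python. This is an illustration under the usual assumption that plus and minus signs are equally likely under the null hypothesis; the function name `sign_test_p` is hypothetical, and only the counts (26 of 38, 7 of 8) come from Table 30.

```python
from math import comb

def sign_test_p(plus, n):
    """One-tailed P(X >= plus) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(plus, n + 1)) / 2 ** n

p_total = sign_test_p(26, 38)  # combined judges: 26 plus signs in 38 comparisons
p_judge = sign_test_p(7, 8)    # a single judge with 7 plus signs in 8 sorts
print(round(p_total, 3), round(p_judge, 3))
```

Rounded to two decimals, the combined value agrees with the .02 reported in the text. The single-judge value is 9/256, which the table appears to report as .03.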


Hypothesis IVa. Hypotheses IVa and IVb 
posited that judges' descriptions of children 
extrapolated from parent personality test 
data would exceed chance expectancy. Since 
the two hypotheses are related, the findings 
are presented jointly and appear in Table 
31. In Column 3 of Table 31 are entered 
the mean correlations with the criteria of 
child stereotype descriptions compiled by 
five MMPI judges. These correlations 
ranged from .15 to .24 and averaged .20. 
All but one mean coefficient (.15) is reliably 
greater than zero at the .05 level or beyond. 
For comparative purposes the mean correla- 
tions with the criteria (validity coefficients) 
of judges’ extrapolated child descriptions 
are entered in Column 9. These coefficients 
ranged from .06 to .41 and averaged .28. 
Three of the five coefficients are significantly 
greater than zero at the .001 level, one is
reliable at the .01 level, and one is not significant.
An examination of these coefficients
shows that, again with exception of
Judge I, all mean correlations with the criteria
of judges’ child descriptions were
greater in magnitude than were corresponding
coefficients based upon judges’ own
stereotypes.

In Column 4 of Table 31 is entered the
number of plus signs (i.e., MMPI-based r
higher than stereotype r) for each judge.
Column 5 reports the one-tailed probability 
values associated with the occurrence of 
signs in Column 4. It can be seen that for 
three out of five judges the individual 
probability values lead to a rejection of the 
null hypothesis of no difference in the ac- 
curacy of the stereotype and extrapolated 
child descriptions at the .06 level or beyond. 
As the important test of Hypothesis IVa 
concerned the total performance of five 
judges, these data are reported separately 
in Row 6. Row 6 shows that in 29 out of 
38 total comparisons, the accuracy of 
judges' extrapolated child descriptions sur- 
passed that of their own stereotypes. For 
such a split, the Binomial test yields a prob- 
ability value of .001. Hypothesis IVa found 
support. 


Hypothesis IVb. In Columns 6, 7, and 8 
of Table 31 data are entered which pertain 
to Hypothesis IVb. In Column 6 of Table 
31 are given the mean correlations with the 
criteria of composite child stereotype de- 
scriptions derived by three of four clinic 


psychologists. These mean coefficients 
ranged from .16 to .28 and averaged .23. 
Two coefficients are reliably greater than 
zero at the .001 level, one at the .01 level, 
one at the .05 level, and one is not signifi- 
cant. Again, with exception of Judge I, all 
mean correlations with the criteria of com- 
posite child stereotypes were lower in mag- 
nitude than corresponding validity coeffi- 
cients of judges’ MMPI-based descriptions. 
In Column 7 of Table 31 are entered the 
number of plus signs (i.e., MMPI-based r
higher than composite stereotype r) for 
each judge. In Column 8 are reported one- 
tailed probability values associated with 
signs in Column 7. An examination of these 
values for individual judges shows that in 
only one instance (Judge V) does the as- 
sociated probability lead to a rejection of 
the null hypothesis of no difference in the 
accuracy of the two descriptions. Row 6 
presents data concerning the combined per- 
formance of five judges. It can be seen 
from these data that in 26 out of 38 com- 
parisons the accuracy of judges’ child de- 
scriptions surpassed that of psychologists’ 
composite stereotypes. The probability as- 
sociated with this split is .02. Thus, on the 
basis of the combined performance of five 
judges, Hypothesis IVb found support. 


TABLE 31


MEAN CORRELATIONS WITH THE CRITERION OF JUDGES’ EXTRAPOLATED CHILD DESCRIPTIONS, JUDGES'
STEREOTYPE CHILD DESCRIPTIONS, AND PSYCHOLOGISTS’ COMPOSITE
STEREOTYPE CHILD DESCRIPTIONS


                      Judges'                         Composite
            Q Sorts   Child       r higher            Child          r higher           Judges’
                      Stereo.                         Stereo.                           MMPI
Judgeᵃ         N         r        N      p               r           N      p              r

Iᵇ             7       .24**      2   [illegible]      .24**         2   [illegible]     .06
IIᵇ            8       .15        6    .14             .16           6    .14            .24**
IV             8       .18*       7    .03             .18*          6    .14            .36***
V              8       .21*       8    .004         [illegible]      7    .03            .41***
VI             7       .24**      6    .06             .28***        5    .23            .29***
Totals        38ᶜ      .20*      29    .001            .23**        26    .02            .28***


ᵃ Excluding Judges III and VII, who derived Q-sort descriptions of adolescents only. The results of these descriptions can be
found elsewhere (Marks, 1959b).
Hypothesis V


Hypothesis V posited that psychologists'
diagnostic evaluation-based descriptions
would better predict the criterion than de- 
scriptions of the same children derived from 
personality test data of their parents. Find- 
ings which relate to this hypothesis are pre- 
sented in Table 32. In Table 32 are given 
the mean correlations with the criteria of 
descriptions of children derived by clinic 
psychologists and by MMPI judges. Entries 
in Column 4 were obtained by correlating 
and averaging each psychologist's diagnostic 
descriptions with criteria descriptions de- 
rived by clinic therapists. These coefficients 
ranged from .38 to .42 and averaged .39. 
All are reliably greater than zero at the .001 
level. In Column 3 are entered mean cor- 
relations with the criteria of child descrip- 
tions extrapolated by judges from MMPI 
profiles of parents. These coefficients ranged 
from .21 to .35 and averaged .29. Two co- 
efficients are reliably greater than zero at 
the .001 level and two at the .05 level. Both 
mean coefficients are reliably greater than 
zero at the .001 level. An examination of 
the coefficients in Column 4 shows that the 
descriptions compiled by psychologist train- 
ees agreed more closely with the criteria 


TABLE 32


MEAN CORRELATIONS WITH THE CRITERION OF
PSYCHOLOGISTS’ CHILD DIAGNOSTIC DESCRIPTIONS
AND JUDGES’ EXTRAPOLATED CHILD
DESCRIPTIONS


                         Judges’   Psychologists’   Psychologists’
               Q Sorts   Child        Child            r higher
Psychologist      N        r            r              N      p

A                17      .30**        .38**           10    .31
B                 8      .35**        .39**            5    .36
Cᵃ                4      .22*         .42**            3     ᵇ
Dᵃ                4      .21*         .40**            2     ᵇ
Totals           33ᶜ     .29**        .39**           20    .15


ᵃ Third-year clinical psychologist trainee.
ᵇ Small size of N precluded application of the Sign test.
ᶜ Excluding four cases due to inaccurate criterion and five
cases in which psychologist served as therapist.
* Significant at .05 level.
** Significant at .001 level.


than did those of staff psychologists. More- 
over, the coefficients of trainees' descriptions 
show greater discrepancy when compared 
with judges than do coefficients of descrip- 
tions derived by clinic staff. In Column 5 
are given the number of plus signs (psy- 
chologists' r higher than judges’ r) for each 
psychologist. Column 6 presents the proba- 
bilities associated with corresponding values 
in Column 5. Unfortunately, the small size 
of the sample for trainees precluded appli- 
cation of the Sign test for their data. The 
probability values associated with the per- 
formance of staff psychologists failed to 
reach an acceptable level. An inspection of 
Row 5 shows that, even when combined, 
the data fail to yield a probability value that 
would lend support to the hypothesis. Thus, 
on the basis of both individual and com- 
bined results, Hypothesis V was rejected. 


Hypothesis Va. Hypotheses Va and Vb 
posited that psychologists’ descriptions of 
children would surpass chance expectancy. 
Data pertaining to these hypotheses appear 
in Table 33. In Column 3 of Table 33 are 
entered the mean correlations with the cri- 
teria of child stereotype descriptions derived 
by clinic psychologists. These coefficients 
ranged from .15 to .27 and averaged .24 (cf.
.20 for judges). One coefficient is reliably
greater than zero at the .001 level, two at 
the .01 level, and one is not significant. For 
comparative purposes the mean correlations 
with the criteria of psychologists' diagnostic 
descriptions are entered in Column 9. These 
coefficients ranged from .38 to .42 and aver- 
aged .39 (cf. .28 for judges). 

In Column 4 of Table 33 the number of 
plus signs (i.e., diagnostic r higher than
stereotype r) are reported for each psy- 
chologist. Column 5 gives the one-tailed 
probability values associated with signs in 
Column 4. Again, the small size of the 
sample for trainees precluded application of 
the Sign test for their data. The probability 
values associated with the performance of 
staff psychologists revealed that only the 
descriptions of Psychologist A were reliably
greater than chance. The combined total of 
the four psychologists appears in Row 5. It 
can be seen from Row 5 that in 27 out of 




ASSESSMENT OF DIAGNOSIS IN CHILD GUIDANCE 29 


TABLE 33


MEAN CORRELATIONS WITH THE CRITERION OF PSYCHOLOGISTS’ DIAGNOSTIC, STEREOTYPE,
AND COMPOSITE STEREOTYPE CHILD DESCRIPTIONS


                          Child     Psychologists'    Composite   Psychologists'   Psychologists’
               Q Sorts    Stereo.      r higher        Child         r higher         Child
                                                       Stereo.
Psychologist      N          r         N      p           r          N      p            r

A                17        .26**      13    .02         .22*        12    .07          .38***
B                 8        .22**       6    .14         .27***       7    .03          .39***
Cᵃ                4        .27***      4     ᵇ          .18*         4     ᵇ           .42***
Dᵃ                4        .15         4     ᵇ          .21***       3     ᵇ           .40***
Totals           33ᶜ       .24**      27    .001        .23**       26    .001         .39***


ᵃ Third-year clinical psychologist trainee.
ᵇ Small size of N precluded application of the Sign test.
ᶜ Excluding four cases due to inaccurate criterion and five cases in which psychologists served as therapist.
* Significant at .05 level.
** Significant at .01 level.
*** Significant at .001 level.


33 comparisons the diagnostic descriptions 
surpassed the stereotypes. The Binomial
test yielded a probability value of .001. On the
basis of the combined data for four psy- 
chologists, Hypothesis Va found support. 


Hypothesis Vb. In Columns 6, 7, and 8 
of Table 33 are entered data which relate 
to Hypothesis Vb. The coefficients in Col- 
umn 6 were obtained by correlating the com- 
bined (composite) stereotype descriptions of 
three psychologists with the criteria and 
then averaging the correlations to obtain 
mean coefficients. Correlations of four aver- 
age composite stereotypes with the criteria 
ranged from .18 to .27 and averaged .23 
(cf. .23 for judges). Two of the four coefficients
were reliably greater than zero at
the .001 level and two at the .05 level. In 
Columns 7 and 8 the number of plus signs 
(i.e., diagnostic r higher than composite
stereotype r) and the probabilities associated
with these values are entered for each
psychologist. Again, the small size of the
sample for trainees precluded an applica- 
tion of the Sign test for their data. The 
probability values associated with the performance
of staff psychologists lead to a
rejection of the null hypothesis of no difference
in the comparative accuracy of the
composite stereotype and diagnostic descriptions
at the .07 level or beyond. The


probability associated with the combined 
plus values of all psychologists is .001. Hy- 
pothesis Vb was supported. 


DISCUSSION*


To recapitulate, the study was designed, 
in part, to assess the efficiency of various 
kinds of data available to child guidance 
practitioners for consideration in making 
decisions. The sources of data were both 
test and nontest. The procedure employed 
permitted comparisons of inferences drawn 
from the MMPI with inferences drawn 
from clinical interviews of mothers and 
from psychological examinations of chil- 
dren. 


Accuracy of Clinicians’ Patient Descriptions 


Considerable attention was given to the 
selection of items used in the study. The 
essential features of the design (i.e., the 
sampling of both “normal” and pathological 
statements from the broad domain of per- 
sonality, and obtaining therapist [criterion] 
ratings of item “pertinence”) insured a 
heterogeneous pool of presumably content-valid
items. Thus, having “built” reliability


* A more extensive discussion of the findings 
can be found in the original paper. 
into the pool, it was not surprising to find
that subsequent estimates of stability, espe- 
cially of the child criterion, were of sub- 
stantial size. The most noteworthy finding 
among the various subset resorts obtained, 
was the relative (and absolute!) magnitude 
of the mean stability estimates of child de- 
scriptions by psychologists and judges. 
These coefficients, and the number of days 
separating sort from resort were .77 for
psychologists and .75 for judges, for 2 and 
14 days, respectively. Of lesser interest, 
perhaps, but of import to an issue dis- 
cussed below were the stability estimates of 
mother descriptions. These descriptions, 
separated in time by 3 months, and by 14 
days, yielded mean coefficients of .65 and
.77 for caseworkers and judges, respectively.

Turning now to validity, certain findings, 
among which were estimates of the accuracy 
of the criterion, question the credibility of 
psychotherapists in matters of criterion im- 
port. Placing overly high confidence in the 
“truth” value of therapists’ judgments on 
the basis of reputed experience, expertness, 
skill, or the like, e.g., “enlisting the cooper- 
ation of . . . training analysts" (Silverman, 
1959, p. 19), while a fashionable expedient, 
may prove to be as much a source of pro- 
fessional security as the confidence (once) 
placed in diagnostic techniques! It is 
puzzling, indeed, why some writers (e.g. 
Kogan, 1954, p. 666) are so confident in 
their appraisal of the criterion problem. 

Another source of disillusionment may 
be found in viewing reliability as a surro- 
gate for validity. For example, the coeffi- 
cient of stability (intrasort reliability) can 
be manipulated to yield an estimate of the 
upper limit of validity. Such an estimate, 
however, turns out to be empirically unrealistic,
since no validity coefficients even approximate
this magnitude. It seems obvious
that the criterion is sorely in need of vali- 
dation itself. 
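One common reading of "an estimate of the upper limit of validity" is the classical test-theory result that a measure's correlation with any other variable cannot exceed the square root of its own reliability. The sketch below assumes that reading; the function name `validity_ceiling` is hypothetical, and the stability coefficients are those quoted in this Discussion.

```python
import math

def validity_ceiling(reliability):
    """Classical upper bound on validity: the square root of the reliability."""
    return math.sqrt(reliability)

# Stability coefficients quoted in the Discussion (.87 and .65).
for label, r_tt in [("stability .87", 0.87), ("stability .65", 0.65)]:
    print(label, "->", round(validity_ceiling(r_tt), 2))
```

Under this assumption the ceilings lie well above any validity coefficient reported in the tables, which is the sense in which the estimate is "empirically unrealistic."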

Unfortunately, the situation is not much 
improved by eliminating the major source 
of error in the criterion. The findings are 
all too clear on this point. For, despite a 
substantially high mean stability coefficient 


of .87, and, despite a correction for inac- 
curacy of the criterion, therapist-psycholo- 
gist agreement over the total sample of 33 
children averaged only .39, and ranged 
from .05 to .67. Thus, even though the de- 
sign precluded uncontaminated therapist- 
psychologist descriptions, less than 20% of 
the predicted variance could be accounted 
for by the latter’s descriptions. Moreover, 
only 22% of the variance could be ac- 
counted for when the error variance was 
partialed out, i.e., when a complete correc- 
tion for attenuation was made. These find- 
ings offer little cause for rejoicing. What 
they suggest, is that diagnosticians (psychol- 
ogists) and therapists, in dealing with the 
same patients, in the same clinic, under pre- 
sumably optimal conditions, and sharing the 
same information, cannot communicate with 
accuracy the salient features of patients’
personalities. In short, they operate from
incongruent frames of reference, which has
the net effect of rendering the psychological
examination of children, if not the diagnostic
process itself, a somewhat indefensible
procedure.
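The variance figures in this paragraph can be checked with Spearman's correction for attenuation, r_corrected = r_xy / sqrt(r_xx * r_yy). The sketch below is an illustration, not the author's computation: it assumes the two reliabilities entering the correction are the .87 stability coefficient cited here and the .77 sort-resort stability reported earlier for psychologists; the text does not state which values were actually used.

```python
import math

def corrected_r(r_xy, r_xx, r_yy):
    """Spearman's correction for attenuation."""
    return r_xy / math.sqrt(r_xx * r_yy)

r_obs = 0.39           # mean therapist-psychologist agreement (from the text)
rel_criterion = 0.87   # mean stability coefficient cited in the text
rel_psych = 0.77       # ASSUMPTION: psychologists' sort-resort stability

raw_var = r_obs ** 2                                            # variance accounted for, uncorrected
corr_var = corrected_r(r_obs, rel_criterion, rel_psych) ** 2    # after full correction
print(f"{raw_var:.1%} uncorrected, {corr_var:.1%} corrected")
```

With these inputs the uncorrected share falls below 20% and the corrected share falls between 22% and 23%, which is in line with the percentages quoted above.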

Furthermore, the findings offer little support
for the commonly held belief that the
collection, combination, and integration of
various sources of data (i.e., a procedure
actually carried out in the clinical situation)
enables the clinician to make statements
about personalities which are significantly
more accurate than those made on the basis
of blind interpretation of much limited data.
Consequently, there is little support for the
notion that knowledge about, and "understanding
of," are linearly related, or that
“understanding of," and “treatment success,”
are necessarily related! Attention
should be called to the fact that at least one
therapist had very little knowledge about
(as defined) and presumably, therefore,
"understanding of," any of his patients.
Nevertheless, all of his patients were
judged to have improved (by standards
other than the therapist’s, of course).

The obvious possibility comes to mind
that, given certain kinds of optimally
weighted data, any surplus contributes more
error than true variance. This issue has
been discussed in detail by Meehl (1957)
and needs no elaboration here. Suffice it to 
say that, even though the optimal weights 
are yet largely unknown, the evidence fa- 
voring the use of certain kinds of informa- 
tion (and disfavoring the use of other 
kinds) is becoming quite clear. For exam- 
ple, notwithstanding the relative superiority 
(note: relative, not absolute) of the MMPI 
as compared with its competitors, the clini- 
cian might take more seriously the findings 
of Hathaway (1956) and others which 
suggest that such classificatory items as 
age, sex, socioeconomic status, education, 
occupational status, etc. (i.e., “minimal
data") have predictive power capable of
clinically important generalizations (e.g.,
Caldwell, 1958; Kostlan, 1954; Sines, 1957;
Taft, 1955). Evidence to this effect can
be found in the present study by comparing 
the relative magnitude of the various data- 
based Q correlations with that of the ster- 
eotypes. 

In summary, when judged in absolute 
terms, the accuracy of the various data- 
based predictions ranges from substantial 
to poor. Most impressive is the stability of 
the child criterion (suggesting that the Q
array can be sorted reliably). Somewhat 
less impressive is the magnitude of case- 
workers' sort-resort correlations. The aver- 
age coefficient over the total of 42 cases was 
only .65, suggesting that whatever infor- 
mation serves as the basis for decisions, is 
of only moderate reliability. The MMPI- 
based mother descriptions yielded an aver- 
age validity coefficient of .36, which is 
moderate. Perhaps of greater interest is 
the fact that in 36 out of 42 comparisons 
(p = .001), judges’ MMPI-based descrip- 
tions of mothers correlated more highly 
with caseworkers’ descriptions after 10 
hours than after only 3. The convergence 
results of MMPI-based descriptions of 
children, though in the direction of those 
of mothers, yielded relatively low mean 
coefficients. Most discouraging, were the 
low congruency findings for psychologists 
and therapists who, for the most part, 
shared similar data (pro dolor). 


Accuracy of Clinicians’ Stereotype 
Descriptions 


A second consideration in evaluating effi- 
ciency, requires a comparison of the ac- 
curacy of predictions made from the data 
relative to that of prediction made on the 
basis of chance. The findings pertaining 
to such comparisons in the present study 
offer several points of interest. 

As expected, the average correlations of 
each of the various subset stereotype de- 
scriptions with the criterion were all sig- 
nificantly greater than zero. However, as 
not expected, the magnitude of the average 
correlations with the criterion of case- 
workers’ and of psychologists’ individual 
stereotypes (r = .41 caseworkers, r = .24 
psychologists) surpassed that of independ- 
ent composite descriptions (r = .28 case- 
workers, r = .22 psychologists). It will be 
recalled that the composite stereotype de- 
scriptions were derived from the four child 
stereotypes compiled by psychologists and 
the four mother stereotypes compiled by 
caseworkers and, after fractionally omitting 
one from each set of four, by forming a 
composite of the remaining three. Accord- 
ingly, four composite child stereotypes and 
four composite mother stereotypes (con- 
sisting of three descriptions each) were ob- 
tained for the express purpose of compar- 
ing each data-based prediction with a pre- 
sumably more accurate stereotype (a more 
stringent estimate of chance) that was free 
of each clinician’s stereotype bias. Both 
theoretically and empirically, when com- 
pared with an individual stereotype, a com- 
posite stereotype (an actuarial description) 
should yield a more accurate prediction. 
What then is the implication when it does 
not? In the present study, especially for 
caseworkers, the obvious implication is that 
a sizable percentage of the variance asso- 
ciated with the accuracy of their descrip- 
tions of mothers can also be accounted for 
by reference to their prevailing conception 
of the “average clinic mother.” In other 
words, failing to discriminate, they de- 
scribe mothers “pretty much the same.” 
That this is unwarranted (i.e., that mothers 
are not “pretty much the same”), can be inferred
from the magnitude of the discrepancy
between the individual and composite
mother stereotype coefficients (viz., between
an r = .28 and an r = .41).

In summary, clinicians can predict better 
than chance not only on the basis of their 
own perceptions of the average clinic child 
and mother, but on the basis of a presum- 
ably more accurate perception comprised of 
a consensus of psychologist and caseworker 
opinion. 
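The composite stereotype procedure described in this section (omit one clinician's stereotype, combine the remaining three, and correlate the result with the criterion) can be sketched as follows. This is a minimal illustration that assumes the composite is a simple item-by-item mean; the study may have combined the sorts differently. All names and the five-item sorts are hypothetical and far shorter than the 135-item array.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length item arrays (a Q correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def leave_one_out_composites(stereotypes):
    """For each clinician, average the stereotypes of all the other clinicians."""
    composites = {}
    for held_out in stereotypes:
        others = [s for name, s in stereotypes.items() if name != held_out]
        composites[held_out] = [sum(v) / len(v) for v in zip(*others)]
    return composites

# Hypothetical item placements for four clinicians' stereotype sorts.
stereotypes = {
    "W": [1, 2, 3, 4, 5],
    "X": [2, 1, 3, 5, 4],
    "Y": [1, 3, 2, 4, 5],
    "Z": [3, 3, 3, 3, 3],
}
criterion = [1, 2, 2, 5, 4]  # hypothetical criterion sort for one case

composites = leave_one_out_composites(stereotypes)
r_individual = pearson(stereotypes["W"], criterion)
r_composite = pearson(composites["W"], criterion)
print(round(r_individual, 2), round(r_composite, 2))
```

Comparing `r_individual` with `r_composite` for each clinician corresponds to the contrast the text draws between individual and composite stereotype coefficients.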


Major Implications 


An important implication of the findings 
is that the clinical utility of any diagnostic 
device, insofar as it is a function of the 
clinician-as-an-instrument variable, demands 
that the device be validated both within the 
setting in which it is used and by the clini- 
cian who chooses to use it. The clinician 
simply cannot legitimately assume that the 
validation of data by their "clinical useful- 
ness" in his own particular setting or that 
even the formal validation of data in 
another similar setting, is sufficient in the 
absence of objective evidence to the effect 
that the clinician's own use of the data 
“pays off." It cannot be seriously contended
that the present findings support the con- 
tinued use of certain projective techniques 
which, on the basis of their use with children,
are of such low or moderate validity
and hence make little appreciable contribution
to patient welfare (cf. their use with
adults) (Kostlan, 1954; Little & Shneid- 
man, 1959; Silverman, 1959; Sines, 1957). 

The state of affairs which quite possibly 
exists in many child guidance clinics today 
requires an evaluation not only of the de- 
vices employed in the psychological evalua- 
tion of children, but of the diagnostic 
process itself. Since the latter is considered 
the sine qua non for planning effective
treatment and since the present findings 
seriously question the efficiency of the proc- 
ess, it is disheartening to contemplate the 
effect upon patients that results from de- 
cisions based upon such an endeavor, 
in addition to the sheer waste of clinical 
time. Insofar as clinicians (psychiatrists, 
social workers, and psychologists) are will- 


ing to question their premises and conclu- 
sions, psychologists should be willing to in- 
vestigate the validity of these premises and 
conclusions. In short, the urgency of his 
therapeutic function should not deter the 
psychologist from his more important func- 
tion—research. 

Another important implication of the 
findings concerns the extension of psycho- 
logical testing in general and of MMPI 
testing in particular, to parents of child 
guidance patients. Despite the fact that 
psychological tests have been administered 
to children since the inception of the child
guidance movement, it was only recently 
that the diagnostic testing of parents was 
even discussed programatically (Rosenz- 
weig & Cass, 1954). One source of diffi- 
culty seems to stem from the alleged threat, 
both to the parent and to the caseworker, 
that parent testing seems to imply. Suffice 
it to say that the present study offered no 
evidence that parents felt threatened by 
such an endeavor. What it did suggest,
however, was first that it might well be the
clinician rather than the parent who feels
threatened; and second, that many parents
require treatment which extends beyond
the "basic structure of the casework
process" (Perlman, 1953, p. 308) and, admittedly,
casework techniques in child guidance
are not entirely appropriate for the
diagnosis or treatment of psychiatric disorders
in adults (Lippman, 1956). Consequently,
clinics should either offer such
service or should refer such cases to more 
appropriate sources—assessment is inevita- 
ble. 


SUMMARY AND CONCLUSIONS 


The present study was designed to inves- 
tigate personality characteristics of parents 
of child guidance patients and to assess the 
relative efficiency of various sources of data
available to clinicians for consideration in
making pretreatment decisions.

Forty-eight cases, representing 90% of
consecutive referrals accepted for treatment
between March 15, 1958 and July 31, 1958,
comprised a normative group. Of these, [number illegible]
cases in which the mother and child remained
in treatment for a minimum of [number illegible]
hours and in which the child was free of
gross organic (cerebral) impairment and/or 
mental deficiency, comprised a validation 
group. 

The average child was a Caucasian, 10- 
year-old boy in the fourth grade. His in- 
telligence was average but he was referred 
to the clinic by the school because of aca- 
demic deficiency. He was described by his 
parents as nervous and fearful, and was 
reported to have sleep disturbances and poor 
peer relationships. The average child was a 
Protestant and came from a home that had 
not been disrupted by death, separation, or 
divorce. His mother and father were 37 and 
41 years of age, respectively. Both parents 
had received a tenth grade education. The 
child's father was employed in a skilled 
clerical, or business (Class III) occupation.

Case assignments, diagnostic methods, 
and treatment procedures followed estab- 
lished clinic routine, with the exception that 
MMPI testing was extended to include par- 
ents. Upon completion of diagnostic eval- 
uations, psychologists and caseworkers com- 
piled Q-sort personality descriptions of 
children and of mothers, respectively. The 
modal psychological examination of chil- 
dren consisted of an intelligence estimate, 
three personality measures, and some index 
of academic achievement. The modal eval- 
uation of mothers consisted of three clinical 
interviews. 

Testing the various hypotheses involved
comparisons of MMPI data from clinic
mothers and fathers with similar data from
their respective normative groups, and comparisons
of the relative accuracy of (a) Q-sort
personality descriptions of children, derived
by psychologists from psychological
examination data of children; (b) Q-sort
personality descriptions of children, derived
by nonclinic judges from blind readings of
the MMPI profiles of children’s parents;
(c) Q-sort child stereotype descriptions, derived
by clinic psychologists from their own
conception of the “average” clinic child;
(d) Q-sort child stereotype descriptions, derived
by nonclinic judges from their own
conception of the “average” clinic child;
(e) Q-sort child composite stereotype descriptions,
derived by clinic psychologists
from their combined conception of the
"average" clinic child; (f) Q-sort personality
descriptions of mothers, derived by
clinic caseworkers from clinical interview
data of mothers; (g) Q-sort personality descriptions
of mothers, derived by nonclinic
judges from blind readings of the MMPI
profiles of mothers; (h) Q-sort mother
stereotype descriptions, derived by clinic
caseworkers from their own conception of
the “average” clinic mother; (i) Q-sort
mother stereotype descriptions, derived by
nonclinic judges from their own conception
of the “average” clinic mother; and (j) Q-sort
mother composite stereotype descriptions,
derived by clinic caseworkers from
their combined conception of the “average”
clinic mother.

The criteria with which the various de- 
scriptions were compared for predictive 
accuracy were Q sorts compiled by clinic 
therapists with whom the children and their 
mothers had had an average of 10 hours of 
interview contact. The design precluded un- 
contaminated clinic Q-sort descriptions. Cri- 
terion descriptions of mothers were com- 
piled by the same caseworkers who had 
interviewed and had Q sorted the mothers 
diagnostically. Moreover, no attempt was 
made to deprive therapists of psychometric 
data about children, for such data are nor- 
mally available to them and therapists’ judg- 
ments are presumably founded to some 
extent upon information available in psychological
reports. Only the MMPI was
withheld from clinic personnel. The MMPI
profiles, in turn, were made available to seven
nonclinic student and expert judges.

Two separate prediction tasks were re- 
quired of the judges by the design. Q-sort 
descriptions of mothers were derived from 
blind readings of their MMPI profiles, and 
Q-sort descriptions of children were de- 
rived from MMPI profiles of their parents. 
An 11-category, forced, quasinormal distri- 
bution was used. Stereotype mother and 
child descriptions in addition to reliability 
data were obtained from every clinician. 

The Q array was empirically derived by 
the clinic staff from approximately 2,000 
items sampled from various sources. In its 
final form, the array consisted of 135 items 
which had been judged a reasonable representation
of the broad domain of personality,
appropriate for children and adults of
both sexes, composed of pertinent clinical
information, ratable by each member of the
clinic staff, and of adequate intersubject
variability for both mother and child.
Approximately 25% of the array contained
genotypic items (75% phenotypic).
Twenty-five percent of the array contained
nonpathological statements. Of these, 7% were
neutral and 18% were descriptive of optimal
adjustment.
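
As a rough illustration of the sorting procedure described above, the following Python sketch forces a judge's raw item ratings into an 11-category quasinormal distribution of 135 items and then correlates two such sorts. The category frequencies, the function names, and the choice of a Pearson correlation as the index of agreement are assumptions made for this example; the frequencies actually used by the clinic staff are not reported here.

```python
from math import sqrt

# Hypothetical category sizes for an 11-category forced quasinormal sort of
# 135 items (they sum to 135); the source does not give the real frequencies.
CATEGORY_SIZES = [3, 6, 10, 15, 21, 25, 21, 15, 10, 6, 3]


def forced_sort(raw_scores, sizes=CATEGORY_SIZES):
    """Place items into categories 1..len(sizes) with fixed category counts."""
    if len(raw_scores) != sum(sizes):
        raise ValueError("number of items must equal the sum of category sizes")
    # Rank items from least to most characteristic; ties broken by item index.
    order = sorted(range(len(raw_scores)), key=lambda i: (raw_scores[i], i))
    placement = [0] * len(raw_scores)
    position = 0
    for category, size in enumerate(sizes, start=1):
        for item in order[position:position + size]:
            placement[item] = category
        position += size
    return placement


def pearson(x, y):
    """Pearson correlation between two equally long lists of placements."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Two made-up sets of raw ratings standing in for a judge and a criterion.
    judge_raw = [(7 * i) % 135 for i in range(135)]
    criterion_raw = [(11 * i) % 135 for i in range(135)]
    judge_sort = forced_sort(judge_raw)
    criterion_sort = forced_sort(criterion_raw)
    print(round(pearson(judge_sort, criterion_sort), 3))
```

Because every sort is forced into the same distribution, all sorts share the same mean and variance, so differences between correlations reflect only how the items are ordered.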

The findings support the following conclusions:

1. Parents of child guidance clinic pa- 
tients differ from general population adults 
with respect to most personality variables 
measured by the MMPI. 

When compared with general population 
females, clinic mothers show significantly 
higher scores on K, Hs, D, Hy, Pd, Pa, Pt, 
Sc, and Si with the strongest trends on Hy 
and Pd. Moreover, abnormal scores occur 
in 50% of clinic mothers' profiles as compared
with percentage frequencies of 14
and 75 among normal and psychiatric 
groups, respectively. An examination of 
clinical scale high points reveals five code 
patterns (13-31, 24-42, 20-02, 34-43, and 
9) which together represent 70% of all pro- 
files in the clinic mother sample. 

The findings for clinic fathers reveal 
scores significantly higher than general population
males on K, Hs, D, Hy, Pd, Mf, Pt,
and Sc with the strongest trends on D and 
Hy. Abnormal scores occur with a percent- 
age frequency of 44 among fathers as com- 
pared with 15 and 75 among normal and 
psychiatric groups, respectively. An exam- 
ination of clinical scale high points reveals 
four code patterns (13-31, 20-02, 4, and 59- 
95) which together represent 85% of all 
profiles in the clinic father sample. 

In 77% of clinic families, at least one
parent demonstrates symptoms and personality
adjustment patterns which are commonly
characteristic of psychiatric abnormality.
Ratings of case movement show that
Codes 13-31 and 9 for mothers, and Codes
4 and 59-95 for fathers, are associated with
rated case improvement, whereas Codes 24-42
and 34-43 for mothers, and Code 13-31 for
fathers, are associated with rated case
unimprovement. Code 20-02, for either parent
and regardless of elevation, is associated
with an unfavorable prognosis. Similarly,
a high Pd minus Ma differential is a ma- 
lignant sign. The average Pd minus Ma 
scores of mothers of cases rated improved, 
partially improved, and unimproved are 2.8,
15.6, and 20.4, respectively.


2. The MMPI, whether used with 
mothers as a source of data for generating 
inferences about mothers or whether used 
with parents as a source of data for gener- 
ating inferences about children, is more effi- 
cient than the diagnostic process itself. 


3. Personality descriptions of mothers 
derived by judges from blind readings of 
MMPI profiles of mothers correlate more 
highly with similar descriptions derived by
caseworkers following 10 interview hours
than with descriptions of the same mothers
derived by the same caseworkers following
only 3 interview hours. 


4. Personality descriptions of children 
derived by judges from blind readings of 
MMPI profiles of children’s parents are no 
less accurate than similar descriptions derived
by clinic psychologists following completion
of routine psychological examinations
of the same children.


5. Clinicians (judges, psychologists, and 
caseworkers) can compile personality de- 
scriptions of mothers and of children which 
correlate more highly with a criterion than 
do stereotype descriptions compiled by the
same clinicians.

6. Clinicians (judges, psychologists, and 
caseworkers) can compile personality descriptions
of mothers and of children which
correlate more highly with a criterion than
do composite stereotype descriptions compiled
by other clinicians.


7. A moderately high percentage of variance
associated with the reliability of caseworkers’
descriptions of mothers can also be
accounted for by reference to caseworkers’
own prevailing conception of the “average”
clinic mother.


8. Psychotherapists, if used as a criterion,
are in need of validation themselves.


9. The collection, combination, and inte- 
gration of various sources of data, i.e., what 
is actually done in the clinical situation, 
does not enable the clinician to make infer- 
ences about personality which are signifi- 
cantly more accurate than inferences based 
upon blind interpretation of much more limited
data.


10. The overall agreement between psy- 
chologists and therapists, in dealing with the 
same patients, in the same clinic, and shar- 
ing the same information, is discouragingly 
low. In short, they operate from incongruent
frames of reference, which has the
net effect of rendering the psychological
examination, if not the diagnostic process itself,
an extravagant waste of clinical time.


11. In addition to engendering personality
descriptions which are useful in treatment
planning and in case disposition,
MMPI code patterns yield practical actu- 
arial data which predict response to clinic 
treatment. 


12. The validity of any diagnostic de- 
vice, insofar as it is a function of the clini- 
cian-as-an-instrument variable, demands 
that the device be validated both within 
the setting in which it is used and by the 
clinician who chooses to use it. The clini- 
cian cannot legitimately assume that the 
validation of data by their clinical useful- 
ness in his own particular setting or that the 
formal validation of data in another similar 
setting is sufficient in the absence of evi- 
dence to the effect that his own use of the 
data “pays off.” 

APPENDIX A


BEHAVIOR RATING SCALE 


(Total symptoms and referral complaints) 


                                                       Total
Behavior Rating                                        N     %

Unable to achieve in school                           22    46
Fearful                                               12    25
Nervous, anxious                                      12    25
Sleep disturbances: nightmares, insomnia, etc.        12    25
Poor peer relationships                               12    25
Passive: overly conforming, unable to fight back      11    23
Tense, overly sensitive                               11    23
Shy, withdrawn                                        10    21
Enuretic, soiling                                     10    21
Cries easily                                           9    19
Psychosomatic symptoms                                 9    19
Eating difficulties                                    9    19
School phobia                                          8    17
Perfectionistic                                        8    17
Bizarre behavior                                       7    15
Hyperactive, impulsive, unpredictable                  7    15
Difficult to control, unmanageable                     7    15
Unable to concentrate                                  6    13
Aggressive                                             5    10
Mischievous, provocative                               5    10
Speech difficulties: articulation, phonetics, etc.     5    10
Excessive lying                                        4     8
Lack of effort: listless, apathy                       4     8
Physical handicaps: hearing, vision, etc.              4     8
Dreamy, day dreams                                     4     8
Tics, habit spasms, mannerisms                         4     8
Stealing: delinquent                                   3     6
Defiant, nonconforming, disobedient                    3     6
Destructive                                            3     6
Sex difficulties                                       3     6
Temper tantrums, screaming                             3     6
Excessive fantasy                                      3     6
Excessive fighting                                     2     4
Truant                                                 2     4
Fire setting                                           2     4
Immature thumb sucking                                 2     4
Emotionally overreactive                               2     4
Suicide attempt                                        2     4
Compulsive acts, obsessed with specific topics         2     4
Does not speak                                         2     4
Negativistic                                           2     4
Runs away                                              2     4
Demands attention                                      1     2

APPENDIX B
THE Q ARRAY


1. Reports difficulty in thinking (e.g., cannot 
concentrate). 

2. Tends to be ruminative and overideational. 

3. Obsessive thinking present. 

4. Is perfectionistic: is compulsively meticulous.

5. Is socially extraverted (outgoing). 

6. Manifests hypochondriacal tendencies, i.e., 
is excessively concerned about physical condition
and functioning, is hypersensitive to and
overevaluates little pains and dysfunctions. 

7. Is self-dramatizing: histrionic. 

8. Is excitable. 

9. Complains of weakness or is easily fatigued.

10. Has feelings of hopelessness. 

11. Has a high aspiration level for self: is 
ambitious, wants to get ahead. 

12. Judges self and others in conventional 
terms like “popularity,” “the correct thing to 
do,” “social pressures,” etc. 

13. Tends to arouse liking and acceptance in 
people. 

14. Has a rapid personal tempo: thinks, 
talks, moves at a fast rate. 

15. Is cheerful. 

16. Is vulnerable to real or fancied threat: 
generally fearful, is a worrier. 

17. Experiences difficulty in giving orders or 
making demands and requests of others. 

18. Has "diagnostic" insight: awareness of 
the descriptive features of own behavior. (Ex- 
amples: that certain symptoms are neurotic, 
that one is not liked by others, that one tends 
to distort in certain ways; that one is depressed, 
that one shows poor judgment in such-and-such
ways, that one underachieves.)

19. Exhibits good heterosexual adjustment. 

20. Is demanding: tends to take the attitude
“the world owes me a living,” “I have a right
to be taken care of,” etc.

21. Is unpredictable and changeable in be- 
havior and attitudes. 

22. Is egocentric, self-centered, selfish: seeks 
need-gratifications (“how will this affect me?”)
with little regard for the happiness and well- 
being of others. 

23. Appears to be poised, self-assured, so- 
cially at ease. 

24. Is defensive about admitting psychological
conflicts: tries to avoid revealing self as
having psychological conflicts and emotional 
distresses. 


25. Delusional thinking is present. 

26. Has grandiose ideas (extreme is delu- 
sions of grandeur). 

27. Seeks out and tries to relate to parent 
figures. 

28. Is resentful. 

29. Exhibits psychotic tendencies. 

30. Exhibits depression (manifest sad mood). 

31. Tends to delay or avoid action: fears 
committing self to any definite course, is in- 
decisive, vacillating. 

32. Is evasive. 

33. Is irritable. 

34. Tends to transfer blame. 

35. Demands sympathy from others. 

36. Behaves considerately towards others. 

37. Is argumentative. 

38. Presents self as being physically, organi- 
cally sick. 

39. Is suggestible: overly responsive to other
people's evaluations rather than own. 

40. Tends to be rebellious and nonconform- 
ing. 

41. Thinks and associates in unusual ways: 
has unconventional thought processes (extreme 
is illogical, confused, or bizarre). 

42. Is apathetic. 

43. Has inner conflict about sexuality (dis- 
tinguish from reality problems in this area). 

44. Has inner conflict about emotional de- 
pendency (distinguish from reality problems in 
this area). 

45. Utilizes acting out as a defense mechan- 
ism. 

46. Reacts to frustration intropunitively (i.e., 
punishes self). 

47. Utilizes regression as a defense mechan- 
ism. 

48. Exhibits evidence of narcissism (latent 
or manifest). 

49. Has conflicts about giving. 

50. Has good verbal-cognitive insight into 
own personality structure and dynamics, and 
has a real "feeling" for these insights. Insight 
not defended against by isolation-intellectuali- 
zation. 

51. Has a need to affiliate with others: i.e.,
to form friendships and associations, to greet 
and converse sociably with others, to join vari- 
ous groups, etc. 

52. Has inner conflicts about self-assertion 
(distinguished from reality problems in this 
area). 

53. Fears loss of control, feels need to keep
rigid check on own emotional responses: can- 
not “let go" even when appropriate. 

54. Utilizes projection as a defense mechan- 
ism. 

55. Has a need to achieve: to overcome ob- 
stacles, to exercise power, to strive to do some- 
thing difficult as well and as quickly as possible 
(this is an elementary ego need which may 
alone prompt action or be fused with any other 
need). 

56. Is retentive: has a need to retain pos- 
session of things; to refuse to give or lend; to 
hoard; to be frugal, economical, and miserly. 

57. Places value on intellectual and cog- 
nitive activities, skills, and attitudes. 

58. Utilizes rationalization as a defense 
mechanism. 

59. Defenses are fairly adequate in relieving 
psychological distress. 

60. Tends toward overcontrol of needs and 
impulses: binds tensions excessively, delays 
gratification unnecessarily. 

61. Utilizes intellectualization as a defense 
mechanism.

62. Life has included rewarding socialization 
experiences. 

63. Would be organized and adaptive when 
under stress or trauma. 

64. Has a resilient ego-defense system: has 
a safe margin of integration, adequate self- 
control. 

65. Values wealth or material possessions and 
judges self and others in terms of them. 

66. Gets appreciable "secondary gain" from 
symptoms (i.e., symptoms function to get person
out of painful, difficult, or stressful situations
in a socially acceptable way or are otherwise
rewarding via their manipulations of
external-relations).

67. Characteristically pushes and tries to 
stretch limits: sees what can be gotten away 
with. 

68. Is self-defeating: places self in an obvi- 
ously bad light. 

69. Is nervous, tense in manner: trembles, 
sweats, or shows other manifest signs of anx- 
iety. 

70. Is distrustful of people in general: ques- 
tions their motivations. 

71. Is readily dominated by others: is sub- 
missive. 

72. Spends a good deal of time in personal 
fantasy and daydreams: fictional speculations. 

73. Keeps people at a distance: avoids close
interpersonal relationships. 

74. Is sensitive to anything that can be con- 
strued as a demand. 

75. Accepts others as they are: is not judg- 
mental. 

76. Is able to sense other person’s feelings: 
is an intuitive, empathetic person. 

77. Is protective of those close to self (place- 
ment of this item expresses behavior ranging 
from overprotection through appropriate nur- 
turance to laissez-faire, unstructuring atti- 
tudes). 

78. Is critical, not easily impressed, skeptical. 

79. Genotype has psychopathic features. 

80. Genotype has schizoid features. 

81. Genotype has hysteroid features. 

82. Genotype has paranoid features. 

83. Genotype has obsessive-compulsive fea- 
tures. 

84. Shows concern over reputation. 

85. Feels there is social stigma attached to 
clinic contact. 

86. Is a serious person who tends to antici- 
pate problems and difficulties, and to look at 
the “dark side” of things. 

87. Is concerned about the qualifications of 
various staff members. 

88. Seems unable to express own emotions in 
any modulated, adaptive way. 

89. There are many “positives” in this case. 

90. Presents a favorable prognosis. 

91. Is overanxious about minor matters and 
reacts to them as if they were real emergencies. 

92. Is a shy, anxious, and inhibited person. 

93. Resorts to escape into fantasy. 

94. Has unresolved Oedipal problems. 

95. Tends to be flippant both in word and 
gesture. 

96. Is open and frank in discussing prob- 
lems. 

97. Has a need to think of self as an un- 
usually self-sufficient person. 

98. Has a tenuous hold on reality. 

99. Has shown ability to talk about con- 
flicts in most areas. 

100. Would be threatened by interpretations 
given early in therapy. 

101. Is suffering from feelings of rejection. 

102. Has a relatively mature superego. 

103. Tends not to become involved in things:
passively resistant.

104. Is stereotyped and unoriginal in approach
to problems.

105. Undercontrols own impulses: acts with
insufficient thinking and deliberation. 

106. Gets along well in the world as it is: 
is socially appropriate in own behavior, keeps 
out of trouble (to be considered as conceptually 
separate from person's intrapsychic state). 

107. Is tearful and/or cries openly. 

108. Emphasizes oral pleasures: is self-in- 
dulgent. 

109. Is consciously guilt-ridden: self-con- 
demning, self-accusatory. 

110. Doesn't seem to be particularly afraid 
of anything. 

111. Is tense, high-strung, jumpy: has an 
overreadiness to respond with startle or ap- 
prehension to unexpected stimulation. 

112. Consistently avoids being put in any 
situation where own performance will be in- 
ferior to that of the others. 

113. Expresses impulses by specific verbal 
"acting out" (e.g. scolding, yelling, cursing, 
etc.). 

114. Exhibits manneristic behavior (tapping 
on table, biting lips, biting nails, wringing 
hands, etc.). 

115. Is afraid of emotional involvement with 
others. 

116. Has a wish (conscious or unconscious) 
to kill people who thwart self in any way. 

117. Has a wish (conscious or unconscious) 
to take others' possessions from them. 

118. Undervalues and consistently derogates 
the opposite sex. 

119. Psychic conflicts are represented in 
somatic symptoms. 

120. Handles anxieties and conflicts by refusing
to recognize their presence.


121. Possesses a basic insecurity and need for 
attention. Search for “love” is a compulsive or 
neurotic search for security. 

122. Overreacts to danger or makes emer- 
gency responses in the absence of actual dan- 
ger. 

123. Is made anxious or disturbed by im- 
pulses to commit a criminal act or hostile act 
(e.g. desire to stab, beat, or kill someone, to 
set a fire, to mutilate an animal). 

124. Fears or phobias present (include all
the common fears such as claustrophobia, 
school phobia, etc.). Continuum ranging from 
slight anxiety to severe inhibition of activity 
because of the fears. 

125. Has superior intellectual ability (based 
on clinical observations of functioning level 
only). 

126. Repressive mechanism functions ade- 
quately. 

127. Easy to talk to and get along with in 
this kind of relationship. 

128. Has the capacity for forming close in- 
terpersonal relationships. 

129. Has an exaggerated need for affection. 

130. Ego strength (continuum ranging from 
severe ego weakness through moderate strength 
to exceptionally strong ego development). 

131. Has obsessional character problems. 

132. Has developed defenses which them- 
selves cause suffering. 

133. Is a reliable informant. 

134. Is provocative. 

135. Is “normal,” healthy, symptom free 
(psychologically). 

THE passage of the National Mental
Health Act in 1946 represents an important
milestone in the development of
community mental health activities in this 
country. For at least four decades prior to 
the passage of this act, interest in mental 
health activities had been growing only 
slowly and sporadically within the profes- 
sions involved. The lack of professional 
manpower, financial resources, and a broad, 
general demand among the population-at- 
large all served to severely restrict the early 
beginnings of community mental health 
activities and interest. However, by the 
early 1940s there had been sufficient prog- 
ress in this development to attract the atten- 
tion of Congress to the potential value of 
mental health activities and to the necessity 
for broad federal support in this area. 
Through the federal support authorized by 
the National Mental Health Act, the growth 
of community mental health activities has 
been sharply accelerated, and these activ- 
ities have greatly expanded to include ever 
more personnel and newer functions. 

This recent and rapid growth of commu- 
nity mental health activities has not been 
without its problems. The great strides
made in the broad areas of research, service,
and training have often been uncoordinated
and unbalanced. Services have
sometimes been slow in changing in the
light of new research findings, training
programs have sometimes been slow to
change to incorporate new service functions
and research methods, and research efforts
have sometimes been slow in changing from
older problem areas to newer ones.


1 This survey was carried out while the senior
author was a Research Fellow in Psychology in the
Department of Psychiatry, Harvard Medical
School and receiving postdoctoral training in the
Massachusetts General Hospital’s Community Mental
Health Program.

2 Rossi is also at the Boston City Hospital;
Klein and vonFelsinger are also at the Massachusetts
General Hospital.

These discrepancies have existed, and 
exist, within all the mental health profes- 
sions in general, and within psychology in 
particular. Basically a science, with its
tradition of basic research and freedom of
choice of area of investigation, psychology
has always experienced self-doubts in pro- 
viding training for its members to engage 
in professional services and goal oriented 
research. It has been only in the very re- 
cent past that such training has become an 
integral, accepted part of academic psy- 
chology departments. Now, as nonacademic 
psychologists have expanded from such 
activities as individual therapy and diagnosis
into the broader areas of prevention,
coordination of services, administration, and 
consultation, once again self-doubts have 
arisen within academic training programs 
concerning the wisdom or feasibility of pro- 
viding training for such activities. Like- 
wise, as the attention of field researchers 
has turned to multivariate problems involv- 
ing aggregates of individuals interacting 
within families, neighborhoods, and com- 
munities, academic research training still 
largely does not include training in methods 
and techniques appropriate to such areas of 
investigation. 

As this “training lag” increases, more 
and more individual psychologists and 
groups of psychologists have become involved
with the question of how this lag
may be overcome. The writings of such 
individuals as Filmore Sanford (1958), 
Stuart Cook (1958), and Gelfand and 
Kelly (1960), and the proceedings of such 
organizations as the Conference of Chief 
Psychologists in State Mental Health Pro- 
grams (1951-59) have stimulated an in- 
creasing amount of thought and discussion 
concerning this problem. The American 
Psychological Association has responded to 
the increasing interest in this training ques- 
tion by sponsoring a national conference on 
the subject (Strother, 1956) and by ap- 
pointing various subcommittees to study this 
question (APA, 1959). 

The difficulty in resolving this training 
lag is due to many reasons, among which 
is the lack of data concerning the present 
activities of psychologists in community 
mental health and the settings in which 
these activities are being carried out. It is 
generally understood that many psycholo- 
gists are engaging in "new" activities and 
that many psychologists are employed in 
"new" settings; however, to date no specific 
data in this area have been collected. In 
order to make a beginning step in over- 
coming this deficiency, the present survey 
was carried out. In addition to gathering 
data on the activities of psychologists in- 
terested in community mental health, it 
was also the purpose of this survey to learn 
the opinions of these psychologists concern- 
ing the education and training needs of those 
involved in community mental health ac- 
tivities. 


PROCEDURE 


Respondents 


The initial selection of respondents for the sur- 
vey was accomplished by reading through the 
1959 APA Directory in which each member is 
requested to list his main areas of interest and 
his present position. The criteria for selection as
a respondent were that the member: (a) had explicitly
expressed an interest in “community mental
health"; (b) had expressed an interest in some 
closely related area such as "administration of 
mental health services" or "research on mental 
health education programs"; or, (c) his present 
title and position implied active interest in community
mental health, such as “Chief Mental
Health Coordinator, State Department of Mental
Health” or “Assistant Director, State Association
for Mental Health.” By this procedure, 409 members
were initially selected for inclusion in the
survey.

All respondents in the survey were asked to
supply the names of psychologists they knew to
be “interested or engaged in community mental
health activities.” This request resulted in the
accumulation of 632 more nonduplicated names,
which made a total of 1,041 names of psychologists
who apparently had an interest in community
mental health activities. All of the 409 initially
selected persons were included in the survey, but
only 206 of the 632 “referred” persons were included
because of financial and time limitations,
which made a total of 615 respondents who were
sent questionnaires. Out of these 615 respondents,
38 did not receive the questionnaire for various
reasons (e.g., “moved, no forwarding address,”
“deceased,” etc.), which left a total of 577 effective
respondents.

From these 577 respondents, 377 completed questionnaires
were returned. The percentages of replies
received from the respondents selected by APA
Directory listings and from the respondents referred
by other respondents were equal, at 65%. Of the
377 respondents who returned completed questionnaires,
23 disavowed any active interest in community
mental health activities, and so the questionnaire
replies from these 23 respondents were
omitted from the questionnaire tabulations. Therefore,
the results of this survey are based on the
replies received from 354 APA members who have
expressed an interest in community mental health
activities. For a description of these respondents
see Descriptive Data in the Results section.
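
The respondent counts reported above can be checked with a few lines of arithmetic. The Python sketch below is only a bookkeeping aid; the variable names are illustrative, and every figure is taken from the Respondents section.

```python
# Counts reported in the Respondents section.
initially_selected = 409   # chosen from the 1959 APA Directory
referred_names = 632       # nonduplicated names supplied by respondents
referred_included = 206    # referred persons actually sent questionnaires
undelivered = 38           # questionnaires that never reached the addressee
returned = 377             # completed questionnaires returned
disavowed_interest = 23    # returned, but disavowed any active interest

total_names = initially_selected + referred_names
questionnaires_sent = initially_selected + referred_included
effective_respondents = questionnaires_sent - undelivered
return_rate_percent = round(100 * returned / effective_respondents)
analyzed = returned - disavowed_interest

print(total_names, questionnaires_sent, effective_respondents,
      return_rate_percent, analyzed)
# prints: 1041 615 577 65 354
```

Each derived figure agrees with the total stated in the text, including the 65% return rate and the 354 replies on which the results are based.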

Toward the close of the survey, 257 postcards
were mailed out to those who had not yet replied
to the questionnaire. Self-addressed, stamped
postcards were provided for their replies. From
the 257 respondents, 138 replies were received.
Of the 138 replies, 42 indicated they did not complete
the questionnaire because they had no interest in
community mental health activities. The rest of
the replies indicated that the respondents had interest
in this area but did not complete the questionnaire
because of lack of time or some other
such reason.

A copy of the questionnaire used is included in 
Appendix A. In addition to the questionnaire 
replies, data concerning the respondents were also 
obtained from the APA Directory listings (e.g.,
age, degree, years since degree, etc.).


Analysis 


In order to facilitate the analysis of many replies
to many questions, the replies were coded,
punched on IBM cards, and tabulated by IBM
machines. The statistical analyses were carried
out using absolute frequencies (see Appendix B),
but percentage frequencies are used in the tables
in order to make easier the comparison of replies
received from different sized groups.

There was reason to believe that four variables
might have had an influence on the responses of
each subject: the degree he held, the type of setting
in which he was employed, his main activity
(defined as spending 60% or more of one's time
in the activity), and his opinion on what the term
"community mental health" denoted. Accordingly,
these variables were used to separate the respondents
into subgroups, and a separate analysis of the
results was made for each subgrouping.3


RESULTS 


The results of this survey are presented 
under eight headings: Descriptive Data, 
Present Activities, Future Activities (antic- 
ipated), Desired Content Areas (for in- 
clusion in education and training programs), 
Level for Education and Training, Locus 
for Predoctoral Training, Locus for Post- 
doctoral Training, Influences in the Devel- 
opment of Interest in Community Mental 
Health, and General Comments by Respondents.

Under each of the above headings with 
the exception of Present Activities and 
General Comments by Respondents, the re- 
sults are presented under six subheadings: 
Total Group, Degree, Place of Employment, 
Main Activities, Opinion on What “Community
Mental Health Is,” and Summary.

Although this method of presenting the 
results inevitably leads to occasional
repetition in the presentation, it was felt
that this method would aid in making the 
complex results more comprehensible. 


Descriptive Data 


Total Group. Tables 1 through 5 summarize
the descriptive data concerning the
respondents. A noteworthy finding presented
in Table 1 was that, for the total group, interest
in community mental health developed
approximately a year after the final
degree was received. From Table 2, it can
be seen that there was a significant difference
between the geographical distribution
of the respondents and the geographical distribution
of APA members.* There are at
least two valid interpretations of this difference:
(a) the survey sample, in comparison
with APA membership, had too high a
representation from National Institute of
Mental Health (NIMH) Regions I and VI
and too low a representation from NIMH
Regions II and V; or (b) relatively more
psychologists from Regions I and VI have
an interest in community mental health and
relatively fewer psychologists from Regions
II and V have such an interest. There is
no available criterion to decide which interpretation
is the sounder.


3 Separate analyses were also made of subgroupings
based on age, years since final degree
was granted, opinion as to the level at which community
mental health training should be offered,
and opinion as to the desired locus for community
mental health training. The results of these
analyses, however, are omitted from the present
report in the interest of space.


From Tables 3 and 4, it can be seen that
the respondents were about evenly distributed
among the various places of employment
and were about evenly distributed according
to main activities. It can also be
seen from Table 5 that, out of the total
number of respondents, approximately one-fourth
believe that "community mental
health" denotes an attitude, approximately 
one-fourth believe that the term denotes an 
area of interest, and approximately one- 
fifth believe that the term denotes a spe- 
cialty area. 

Degree. There were no significant differ- 
ences between the doctorates and nondoc- 


* The National Institute of Mental Health Regions are as follows:

I. Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, Vermont.

II. Delaware, New Jersey, New York, Pennsylvania.

III. District of Columbia, Kentucky, Maryland, North Carolina, Virginia, West Virginia.

IV. Alabama, Florida, Georgia, Mississippi, South Carolina, Tennessee.

V. Illinois, Indiana, Michigan, Ohio, Wisconsin.

VI. Iowa, Kansas, Minnesota, Missouri, Nebraska, North Dakota, South Dakota.

VII. Arkansas, Louisiana, New Mexico, Oklahoma, Texas.

VIII. Colorado, Idaho, Montana, Utah, Wyoming.

IX. Arizona, California, Nevada, Oregon, Washington, Alaska.

TABLE 1
DESCRIPTIVE DATA CONCERNING RESPONDENTS

                                                          Median        Median Years      Percentage
                                                Median    Years Since   of Interest       of
Variable                                  N     Age       Degree        in "CMH"          Doctorates

Total Subjects                           354    39.3       8.3           6.9                79
Degree:
  Doctorates                             279    39.5       7.6           6.6               100
  Nondoctorates                           71    38.6      11.7           8.0                 0
Place of Employment:
  Department of Psychology                44    41.5       9.6           8.4                91
  Other Academic Department               48    40.6       8.6           7.9                94
  Government Agency                       49    38.7       7.8           8.2                88
  Hospital or Clinic                     105    37.6       6.6           4.5                86
  School                                  52    43.6      10.5           8.5                52
  Other                                   51    36.8       8.3           [illegible]        65
Main Activities:
  Varied                                 114    39.5       8.5           6.2                76
  Research and/or Teaching                78    39.4       8.6           8.0                91
  Therapy and/or Diagnosis                85    37.9       6.9           [illegible]        72
  Administration and/or Consultation      68    40.0       8.5           8.8                84
Opinion on what the term "CMH" denotes:
  Attitude                               124    39.0       7.8           7.6                79
  Interest                               101    37.7       8.2           6.1                80
  Specialty                               65    43.8       9.2           [illegible]        77

Note.—For results of tests of significance, see Appendix B.


torates in age, years since interest developed 
in community mental health, geographical 
distribution, or opinion on what the term 
“community mental health" denotes (see 
Tables 1, 2, and 5). There were significant 
differences between these two subgroups in 
years since degree, place of employment, 
and main activities (see Tables 1, 3, and 
4). For this sample, there were more doc- 
torates than nondoctorates in academic 
settings, government settings, and hospitals 
or clinics, and more nondoctorates than 
doctorates in school settings and in unclas- 
sified settings (prisons, industry, etc.). 
Also for this sample, there were more doc- 
torates than nondoctorates mainly engaged 
in research and/or teaching, and more non- 
doctorates than doctorates engaged in 
therapy and/or diagnosis. The nondoctor- 
ates have held their final degree for more 
years than the doctorates. 


Place of Employment. The differences
in geographical distribution of the subgroups
employed in different settings were
not tested, because the χ² test resulted in a
9 X 6 table and the expected frequencies in
too many cells were too small for a valid
use of this test. There were significant differences
between these subgroups on each
of the other descriptive variables. In comparison
with those employed in other settings:
(a) those employed in school settings
were older, had held their final degrees
longer, and had a higher proportion of nondoctorates
(see Table 1); (b) those employed
in hospitals or clinics were younger,
had held their final degrees for a shorter
period of time, and had developed an interest
in community mental health comparatively
recently (see Table 1); (c) more of
those employed in academic settings were
mainly engaged in research and/or teaching
(see Table 4); (d) more of those employed
in government settings were mainly engaged
in administration and/or consultation
(see Table 4); (e) more of those
employed in schools, hospitals, or clinics
were mainly engaged in therapy and/or
diagnosis (see Table 4); and (f) more of
those employed in academic departments
other than psychology were of the opinion
that "community mental health" denotes an
area of interest, and fewer of this subgroup
were of the opinion that the term denotes
an attitude (see Table 5).
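
The reason the 9 X 6 test was dropped can be made concrete with a rough sketch. It assumes the common rule of thumb that expected cell frequencies should be at least 5 (the text does not state which criterion was applied), and it approximates the regional marginals by the "Total Subjects" percentages of Table 2, with the damaged entries for Regions I and III read as 13 and 7.

```python
# Subgroup sizes by place of employment (Table 2).
GROUP_N = {
    "Department of Psychology": 44,
    "Other Academic Department": 48,
    "Government Agency": 49,
    "Hospital or Clinic": 105,
    "School": 52,
    "Other": 51,
}
# Assumption: regional marginals approximated by the Total Subjects row of
# Table 2 (Regions I-IX); Regions I and III are read as 13 and 7.
REGION_PCT = [13, 24, 7, 8, 12, 12, 6, 3, 14]
MIN_EXPECTED = 5  # rule-of-thumb threshold, not taken from the source


def expected_counts(group_n, region_pct):
    """Expected cell frequencies under independence of setting and region."""
    total = sum(region_pct)
    return {g: [n * p / total for p in region_pct] for g, n in group_n.items()}


cells = [e for row in expected_counts(GROUP_N, REGION_PCT).values() for e in row]
small = sum(1 for e in cells if e < MIN_EXPECTED)
print(f"{small} of {len(cells)} cells have an expected frequency below {MIN_EXPECTED}")
```

Under these assumptions well over a third of the cells fall below the threshold, which illustrates why the authors judged the test invalid for this table.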
Main Activities. There were no significant
differences among the subgroups mainly
engaged in different activities in age and
years since degree (see Table 1). However,
in comparison with those mainly engaged
in other activities: (a) those mainly
engaged in therapy and/or diagnosis had
developed an interest in community mental
health comparatively recently, and this subgroup
had a higher proportion of nondoctorates
(see Table 1); and (b) fewer of those
mainly engaged in research and/or teaching
were of the opinion that the term "community
mental health" denotes an attitude
(see Table 5). The relationship between
main activities and place of employment
(Table 3) has been described above. The
differences in the geographical distribution
of the subgroups mainly engaged in different
activities are too complex to warrant a
discussion here (see Table 2).

Opinion on What Community Mental
Health Is. There were no significant differ-


TABLE 2
GEOGRAPHICAL DISTRIBUTION OF RESPONDENTS

                                                 National Institute of Mental Health Regions
Variable                                  N      I     II    III   IV    V     VI    VII   VIII          IX

Total Subjects                           354     13%   24%   7%    8%    12%   12%   6%    3%            14%
Total APA Directory                    17083      8    29    9     5     19     7    5     2             16
Degree:
  Doctorates                             279     11    22    7     9     12    12    6     4             15
  Nondoctorates                           71     22    27    7     4     11    11    3     0             14
Place of Employment:
  Department of Psychology                44     11    23    11    7      9     7    11    4             16
  Other Academic Department               48     12    31    2     4     15    17    6     2             10
  Government Agency                       49     20     4    10    14    10     6    8     12            14
  Hospital or Clinic                     105     15    16    4     10    13    19    4     [illegible]   15
  School                                  52      6    48    6     4     17     6    0     0             13
  Other                                   51     14    25    14    6      6    12    8     0             16
Main Activities:
  Varied                                 114     15    18    7     13    10    13    4     3             17
  Research and/or Teaching                78     14    28    6     1     18    11    9     [illegible]   10
  Therapy and/or Diagnosis                85      9    31    6     7     11    16    [VII and VIII: one entry is 3, the other is illegible]   14
  Administration and/or Consultation      68     16    18    12    9     10     6    6     9             15
Opinion on what the term "CMH" denotes:
  Attitude                               124     10    25    9     10    14    12    3     3             14
  Interest                               101     17    23    9     4     13    16    4     3             12
  Specialty                               65     [Regions I through VIII largely illegible; legible fragments: 12, 6]   20

Note.—For results of tests of significance, see Appendix B.


TABLE 3
RESPONDENTS' PLACES OF EMPLOYMENT

                                                Department   Other                     Hospital
                                                of           Academic     Government   or                           Not
Variable                                  N     Psychology   Department   Agency       Clinic     School   Other   Known

Total Subjects                           354    12%          13%          14%          30%        15%      14%     1%
Degree:
  Doctorates                             279    14           16           15           32         10       12      1
  Nondoctorates                           71     6            4            8           20         35       25      1
Main Activities:
  Varied                                 114    15            8            8           39         14       16      0
  Research and/or Teaching                78    31           43            0            9          4       10      3
  Therapy and/or Diagnosis                85     1            1            1           58         22       14      2
  Administration and/or Consultation      68     3            4           56            4         16       15      1
Opinion on what the term "CMH" denotes:
  Attitude                               124    14            3           15           32         16       18      2
  Interest                               101     9           22           15           26         15       12      2
  Specialty                               65    12           14           15           31         11       15      1

Note.—For results of tests of significance, see Appendix B.

TABLE 4
MAIN ACTIVITIES OF RESPONDENTS

                                                Research    Therapy     Administration            No
                                                and/or      and/or      and/or                    Times    No
Variable                                  N     Teaching    Diagnosis   Consultation     Varied   Given    Answer

Total Subjects                           354    22%         24%         19%              32%      1%       2%
Degree:
  Doctorates                             279    25          22          20               31       1        1
  Nondoctorates                           71     7          32          15               38       1        6
Place of Employment:
  Department of Psychology                44    54           2           4               39       0        0
  Other Academic Department               48    71           2           6               21       0        0
  Government Agency                       49     0           2          78               18       0        2
  Hospital or Clinic                     105     7          47           3               42       0        2
  School                                  52     6          36          21               31       0        6
  Other                                   51    16          23          20               35       4        2
Opinion on what the term "CMH" denotes:
  Attitude                               124    12          29          19               35       2        2
  Interest                               101    26          24          24               26       1        0
  Specialty                               65    21          17          21               35       6        0

TABLE 5
RESPONDENTS' OPINIONS ON WHAT THE TERM "COMMUNITY MENTAL HEALTH" DENOTES

                                                                                            No
Variable                                  N     Attitude   Interest   Specialty   Other    Answer

Total Subjects                           354    35%        28%        18%         15%      3%
Degree:
  Doctorates                             279    35         29         18          15       2
  Nondoctorates                           71    34         27         20          15       4
Place of Employment:
  Department of Psychology                44    39         20         18          22       0
  Other Academic Department               48     8         46         19          19       8
  Government Agency                       49    39         31         20          10       0
  Hospital or Clinic                     105    38         25         19          17       2
  School                                  52    38         29         13          12       8
  Other                                   51    43         23         20          14       0
Main Activities:
  Varied                                 114    39         23         20          16       [illegible]
  Research and/or Teaching                78    19         33         18          24       5
  Therapy and/or Diagnosis                85    42         28         13          13       2
  Administration and/or Consultation      68    35         35         21           7       1

Note.—For results of tests of significance, see Appendix B.


ences between the subgroups based on
opinions concerning what the term "community
mental health" denotes in years
since degree, years since interest developed
in community mental health, proportion of
doctorates to nondoctorates, and in geographical
distribution (see Tables 1 and 2).
These subgroups did differ significantly in
age, however, with the subgroup believing
that community mental health is a specialty
being older than the other subgroups. The
relations of "opinion" with main activities
and with place of employment have been
described above.


Summary. In reading further, it would
be well for the reader to keep in mind the
following generalizations concerning the
findings thus far: main activities are related
to place of employment, degree held, and
geographical area; opinion on what the
term "community mental health" denotes
is related to main activities and place of
employment; and place of employment is
related to degree held.


Present Activities 


Two statistics were obtained from the 
questionnaire replies relating to present 
activities: percentage of total respondents 
engaging in each activity and median time 
spent in each activity by those who engage 
in that activity. These data are presented 
in Table 6. It can be readily seen from this 
table that a higher percentage of the re- 
spondents engaged in consultation than in 
any other single activity. Administration 
was the second most frequently mentioned 
activity, closely followed by research and 
community education. However, although 
a higher percentage of the respondents en- 
gaged in consultation, an average of only 
12.2 hours per month was spent in this 
activity by these respondents. This is in 
contrast to the average of 30.8 hours per 
month spent in therapy by the respondents 
who engage in therapeutic activities.
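
As a consistency check, the percentages in Table 6 can be converted back into head counts and set against the subgroup sizes (N) printed with Tables 7 through 23. The sketch below does only that arithmetic; the dictionary names are illustrative, and the In-Service Education figure follows the text (7.2 hours is not needed here, only the 60%).

```python
N_TOTAL = 354

# Percentage engaging in each activity (Table 6).
PCT_ENGAGED = {
    "Consultation": 85, "Administration": 81, "Research": 77,
    "Community Education": 76, "Diagnosis": 67, "Therapy": 66,
    "In-Service Education": 60, "Teaching": 53, "Training": 50,
}
# Subgroup sizes printed with the follow-up tables (Tables 7-23).
REPORTED_N = {
    "Consultation": 301, "Administration": 285, "Research": 274,
    "Community Education": 268, "Diagnosis": 238, "Therapy": 232,
    "In-Service Education": 209, "Teaching": 187, "Training": 173,
}

# Difference between the count implied by the percentage and the printed N.
differences = {
    activity: round(N_TOTAL * pct / 100) - REPORTED_N[activity]
    for activity, pct in PCT_ENGAGED.items()
}
for activity, diff in differences.items():
    print(f"{activity:22s} implied minus reported N: {diff:+d}")
```

The implied and printed counts agree to within a few respondents, the differences being no larger than rounding to whole percentages and occasional omitted answers would plausibly produce.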

Those who engaged in consultation were 
requested to answer two subquestions: 
“consultation to whom?” and “consultation 
TABLE 6
PRESENT ACTIVITIES OF RESPONDENTS
(N = 354)

                           Percentage That       Median Hours Per Month
Activity                   Engage in Activity    Spent in Activity

Consultation                      85                    12.2
Administration                    81                    24.4
Research                          77                    19.8
Community Education               76                     5.3
Diagnosis                         67                    23.7
Therapy                           66                    30.8
In-Service Education              60                     7.2
Teaching                          53                    19.9
Training                          50                    14.1
Other                             46                    16.6


for what?" The tabulations of the replies 
to this question are presented in Tables 7 
and 8. The most frequently mentioned con- 
sultees were personnel of caretaker agencies 
and schools. Almost twice as many of the 
respondents reported offering consultation 
to these agencies in comparison with those 
who offered consultation to mental health 
agencies. The most frequently mentioned 
areas for consultation were program plan- 
ning, treatment, and diagnosis. One impli- 
cation from these findings is that a high 
percentage of these psychologists is offering 
consultation to nonmental health personnel 
in the area of program planning. Such ac- 


TABLE 7
REPLIES TO "CONSULTATION TO WHOM" BY THOSE WHO ENGAGE IN CONSULTATION
(N = 301)

                             Percentage of Those
Consultee Agencies           Doing Consultation

Caretaker Agencies                  60
Schools                             54
Mental Health Agencies              30
Industries                           5
Other                                8
No Answer                           [illegible]


tivities undoubtedly require a thorough 
acquaintance with the problems, functions, 
and methods of nonmental health personnel 
and their agencies, and imply that acquaint- 
ance with these factors would be an impor- 
tant part of any training program for com- 
munity mental health. 

A further interesting finding was that a 
higher percentage of these psychologists re- 
ported consulting in the areas of treatment 
and diagnosis than the percentage that re- 
ported consulting with mental health per- 
sonnel. One obvious conclusion from this 
finding is that consultation is being offered 
to nonmental health personnel in the area 
of treatment and diagnosis. Since it is 
highly unlikely that consultation would be 
offered to nonmental health personnel in 
the traditional methods of treatment and 
diagnosis, it is probable that modified meth- 
ods of treatment and diagnosis are involved 
in this consultation. Again, knowledge of 
such modified methods apparently would 
be an important part of community mental 
health training programs. 
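
The inference in the preceding paragraph rests on simple arithmetic that can be written out. The sketch below is an illustration rather than anything the authors computed: if 43% of consultants reported consulting on treatment and only 30% reported mental health agencies as consultees, then at least the difference must have consulted on treatment without any mental health agency as consultee. The function name is hypothetical.

```python
N_CONSULTANTS = 301          # those who engage in consultation (Tables 7 and 8)
PCT_MENTAL_HEALTH = 30       # consult with mental health agencies (Table 7)
PCT_TREATMENT = 43           # consult in the area of treatment (Table 8)
PCT_DIAGNOSIS = 40           # consult in the area of diagnosis (Table 8)


def minimum_outside(pct_area, pct_group):
    """Smallest possible percentage in an area who are NOT in the group."""
    return max(0, pct_area - pct_group)


treatment_floor = minimum_outside(PCT_TREATMENT, PCT_MENTAL_HEALTH)
diagnosis_floor = minimum_outside(PCT_DIAGNOSIS, PCT_MENTAL_HEALTH)
print(f"treatment: at least {treatment_floor}% "
      f"(about {round(N_CONSULTANTS * treatment_floor / 100)} respondents)")
print(f"diagnosis: at least {diagnosis_floor}% "
      f"(about {round(N_CONSULTANTS * diagnosis_floor / 100)} respondents)")
```

These are lower bounds only; the true overlap cannot be recovered from the marginal percentages in Tables 7 and 8.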

Those respondents who reported some 
administrative duties were requested to an- 
swer two subquestions: “setting for admin- 
istration?” and "type of program adminis- 
tered?” The replies to this question are 
presented in Tables 9 and 10. It can be 
seen from Table 9 that the most frequently 
mentioned settings for administration were 


TABLE 8
REPLIES TO "CONSULTATION IN WHAT AREAS" BY THOSE WHO ENGAGE IN CONSULTATION
(N = 301)

                             Percentage of Those
Areas for Consultation       Doing Consultation

Program Planning                    46
Treatment                           43
Diagnosis                           40
Research                            25
Personnel Practices                 20
Training                            11
Education Planning                   8
Other                               14
No Answer                           10

TABLE 9
REPLIES TO "SETTING FOR ADMINISTRATION" BY THOSE WHO ENGAGE IN ADMINISTRATION
(N = 285)

                                 Percentage of Those
Administration Setting           Doing Administration

Clinics                                 25
Universities                            24
State, County, City System              19
Institutions                            14
Schools                                 14
National or Regional System              3
Other                                    2
No Answer                                3


clinics and universities; and it can be seen 
from Table 10 that the most frequently 
mentioned type of program administered 
was a service program. A strong subjective 
impression gained during the coding and 
tabulating of the replies to these questions 
indicated that very few of the programs 
administered in university settings involved 
service, and that most of the programs ad- 
ministered outside a university setting did 
involve service to some degree. Despite 
the nature of the programs, however, it is 
evident from the tabulated data that a sub- 
stantial percentage of these psychologists 


TABLE 10
REPLIES TO "TYPE OF PROGRAM ADMINISTERED" BY THOSE WHO ENGAGE IN ADMINISTRATION
(N = 285)

                                     Percentage of Those
Type of Program Administered         Doing Administration

Service Alone                               41
Research Alone                              14
Training and Education Alone                14
Service and Training                        10
Service and Research                         6
Research and Training                        5
Service, Research, and Training              8
Other                                        1
No Answer                                    1


carry out administrative duties in a variety 
of settings; and from Table 6 it can be seen 
that those respondents who do have admin- 
istrative duties spend an average of 24.4 
hours a month, or approximately one-fourth 
of their time, in this activity. 

The third most frequently mentioned activity
for these psychologists was research.
Those psychologists who reported some research
activity were requested to answer
three subquestions: "type of research?" 
“subjects?” and “role in research?" How- 
ever, the replies to the latter two subques- 
tions were too varied to allow for systemat- 
ic coding and tabulation, and so the results 
of these questions cannot be presented here. 
The replies to the first subquestion were 
also varied and only a very gross coding of 
replies was possible. The results of this 
coding and tabulation are presented in Table 
11. The classification of “action, oper- 
ational, and/or program assessment” in this 
table includes all the types of reported re- 
search that implied a direct, practical appli- 
cation of the results of the research to a 
specific problem by a specific group. The 
classification of “other” included all other 
types of research—which mainly consisted 
of research of a theoretical nature. In a 
separate tabulation involving only those re- 
spondents whose main positions were in 
university settings and who reported some 
research activity, it was found that 53% of 
these latter respondents engaged in “theoret- 
ical” research, 24% engaged in “practical” 


TABLE 11
REPLIES TO "TYPE OF RESEARCH" BY THOSE WHO ENGAGE IN RESEARCH
(N = 274)

                                                        Percentage of Those
Type of Research                                        Doing Research

Action, Operational, and/or Program Assessment Alone           35
Program Assessment with Other Types                            20
Other Types Alone                                              38
No Answer                                                       7


research, and 20% engaged in both types 
of research. Therefore, while all the respondents
who reportedly engaged in research
were about equally divided between
practical and theoretical research, those in 
university settings were more likely to be 
involved in theoretical research while those 
outside a university setting were more likely 
to be engaged in practical research. 

A high percentage of the respondents re- 
portedly were involved in community edu- 
cation activities, but the average time re- 
portedly spent in these activities was only 
5.3 hours a month (see Table 6). Those 
respondents reporting some community edu- 
cation activities were requested to answer 
two subquestions: “audience for commu- 
nity education?” and “methods used for 
community education?” The tabulations of 
the replies to these subquestions are pre- 
sented in Tables 12 and 13. It can be seen 
from Table 12 that community education 
efforts were most frequently directed at 
PTA groups, and that outside of these 
groups, community education efforts were 
about evenly distributed among other seg- 
ments of the community. From Table 13 
it is obvious that lectures are far-and-away 
the most frequently used means of impart- 
ing mental health information to the com- 
munity in spite of the doubts many people 
have regarding the efficiency of this meth- 
od. However, a little over half of those 


TABLE 12
REPLIES TO "AUDIENCE FOR COMMUNITY EDUCATION" BY THOSE WHO ENGAGE IN
COMMUNITY EDUCATION ACTIVITIES
(N = 268)

                                 Percentage of Those Doing
Community Education Audience     Community Education

PTA Groups                              47
Civic Groups                            34
Community-at-Large                      28
Church Groups                           23
Parent Groups                           23
Professional Groups                     20
Other                                    7
No Answer                               10
TABLE 13
REPLIES TO "COMMUNITY EDUCATION METHODS USED" BY THOSE WHO ENGAGE IN
COMMUNITY EDUCATION ACTIVITIES
(N = 268)

                                 Percentage of Those Doing
Community Education Methods      Community Education

Lectures                                78
Discussion Groups                       58
Radio and/or Television                 30
Publications                            20
Workshops                                6
Other                                   10
No Answer                                7


respondents engaging in community educa- 
tion efforts reportedly utilize the discussion 
group method in these efforts. This is a 
sizable percentage when the time and ef- 
fort involved in this method are considered. 

Diagnostic and therapeutic activities were 
reported with about equal frequencies by 
the respondents, but the average time spent 
in therapy was somewhat higher than the 
average time spent in diagnosis (see Table 
6). Those respondents who reportedly en- 
gaged in therapy were requested to answer 
three subquestions: "setting for therapy ?" 
"subjects for therapy?" and “therapeutic 
methods used?" The replies to the second 
subquestion proved to be too ambiguous and 
varied for classification purposes and were 


not tabulated. The tabulations of the replies
to the first and third subquestions are presented
in Tables 14 and 15, respectively.
It can be seen from Table 14 that about
40% of those engaging in therapy reportedly
carry out some of this therapy in
private practice. The author has no readily
available data with which to compare this
figure, but it would appear that 40% is a
high percentage to be engaging in private
practice to some degree. For clarification,
it should be mentioned that only two of the
respondents were in full-time private practice,
and that the great majority of those
reporting private practice activities were
carrying out these activities in addition to
other full-time positions. The unexpectedly
large amount of private practice activity
undoubtedly accounts for the finding that
the highest average time per month spent in
any one activity was spent in therapy (see
Table 6).

It can be seen from Table 15 that short-term
individual therapeutic methods were
employed by the largest percentage of those
engaging in therapy, although a sizable
percentage employed the more traditional
long-term individual methods. In a further
breakdown of these data, it was found that
only 9% of those engaging in therapy employed
long-term individual methods alone,
while 25% employed short-term individual
methods alone. It would appear that short-term
methods are found to have greater
utility than long-term methods for those engaging
in community mental health.


TABLE 14
REPLIES TO "SETTING FOR THERAPY" BY THOSE WHO ENGAGE IN THERAPEUTIC ACTIVITIES
(N = 232)

                             Percentage of Those
Setting for Therapy          Doing Therapy

Clinics                             47
Private Practice                    39
Institutions                        18
Schools                             14
Other                                6
No Answer                            2


TABLE 15
REPLIES TO "THERAPY METHODS USED" BY THOSE WHO ENGAGE IN THERAPEUTIC ACTIVITIES
(N = 232)

                             Percentage of Those
Therapy Methods Used         Doing Therapy

Short-Term Individual               81
Long-Term Individual                63
Group                               34
Other                                1
No Answer                            3


TABLE 16
REPLIES TO "SETTING FOR DIAGNOSIS" BY THOSE WHO ENGAGE IN DIAGNOSTIC ACTIVITIES
(N = 238)

                             Percentage of Those
Setting for Diagnosis        Doing Diagnosis

Clinics                             42
Private Practice                    34
Institutions                        20
Schools                             20
Other                                8
No Answer                            2


Those engaging in diagnostic activities 
were requested to answer two subquestions : 
"setting for diagnosis?" and "purpose of 
diagnosis?" The tabulations of the replies 
to these subquestions are presented in Tables
16 and 17, respectively. It can be seen
from Tables 14 and 16 that the settings for
diagnostic activities were quite similar to 
those for therapeutic activities. This find- 
ing was not unexpected, and no further 
discussion of these settings seems war- 
ranted. Likewise, there is little about the 
purposes of diagnosis (see Table 17) that 
would warrant discussion beyond the find- 
ing that almost half of those engaging in 
diagnosis do so in relation to program plan- 
ning. This finding is of interest because the 
usual diagnostic training is not oriented 
towards this purpose. 


TABLE 17
REPLIES TO "PURPOSES FOR DIAGNOSIS" BY THOSE WHO ENGAGE IN DIAGNOSTIC ACTIVITIES
(N = 238)

                             Percentage of Those
Purposes of Diagnosis        Doing Diagnosis

Aid Treatment                       67
Program Planning                    47
Discharge Planning                  14
Vocational Guidance                  8
Other                                7
No Answer                            7
TABLE 18
REPLIES TO "SETTING FOR IN-SERVICE EDUCATION" BY THOSE WHO ENGAGE IN
IN-SERVICE EDUCATION ACTIVITIES
(N = 209)

                                     Percentage of Those Doing
Setting for In-Service Education     In-Service Education

Clinics                                     26
Institutions                                25
Schools                                     20
Government Systems                          15
Social Agencies                              4
Universities                                 2
Other                                        7
No Answer                                    6


It can be seen from Table 6 that approximately
60% of the respondents had some
in-service education duties, and that this
60% spent an average of only 7.2 hours
a month in this activity. This group of respondents
was requested to answer two
subquestions: "setting for in-service education?"
and "audience for in-service
education?" The tabulations of the replies
to these subquestions are presented in Tables
18 and 19. The most noteworthy finding
in Table 18 is that only 4% of these
respondents provided in-service education
within social agencies. It was expected
from the literature that this percentage
would be much higher than it was. Two
obvious possible reasons for this finding are
that either activity in this area is not as
prevalent as one would gather from the
literature or that the question in the questionnaire
was interpreted differently than
was intended, with some respondents naming
the setting out of which they operate
rather than the setting in which they carry
on their in-service education activities.
However, the findings presented in Table
19 are more consistent with expectations.
From this table it can be seen that 77% of
the respondents engaging in in-service education
activities provided this service to
nonmental health personnel (nurses, ministers,
teachers, etc.) in comparison with
49% providing the service to mental health
personnel (psychiatrists and psychiatric social
workers). A note should be made
here that the respondents were requested to
list any activity involving the educating and
training of psychologists under the headings
of either "teaching" or "training" and
not to include such activity under the heading
of "in-service education." This request was
made because it is believed that the methods
and process of providing in-service education
to other psychologists differ in a few,
but fundamental, ways from providing in-service
education to nonpsychologists. The
extent to which the respondents engaged in
the in-service education of nonpsychologists
indicates that this would be another fruitful
area for inclusion in the preparation of
psychologists for community mental health
activities.


TABLE 19
REPLIES TO "AUDIENCE FOR IN-SERVICE EDUCATION EFFORTS" BY THOSE WHO ENGAGE IN
IN-SERVICE EDUCATION ACTIVITIES
(N = 209)

                                     Percentage of Those Doing
In-Service Education Audience        In-Service Education

Caretakers                                  77
Mental Health Personnel                     49
Other                                        3
No Answer                                    5


TABLE 20
REPLIES TO "SETTING FOR TEACHING" BY THOSE WHO ENGAGE IN TEACHING ACTIVITIES
(N = 187)

                                 Percentage of Those
Setting for Teaching             Doing Teaching

Department of Psychology                48
Department of Education                 12
Department of Psychiatry                 9
Other Higher Academic                   29
Nonacademic                              4
Other                                    3
No Answer                               [illegible]

TABLE 21
REPLIES TO "COURSE AREAS TAUGHT" BY THOSE WHO ENGAGE IN TEACHING ACTIVITIES
(N = 187)

                                 Percentage of Those
Course Areas                     Doing Teaching

Clinical                                50
General                                 38
Educational                             28
Mental Health                           21
Social                                  15
Guidance and Counseling                  4
Industrial                               3
Other                                    5
No Answer                                2


TABLE 22
REPLIES TO "LEVEL OF COURSES TAUGHT" BY THOSE WHO ENGAGE IN TEACHING ACTIVITIES
(N = 187)

                                 Percentage of Those
Level of Courses                 Doing Teaching

Undergraduate Alone                     23
Graduate Alone                          25
Undergraduate and Graduate              35
Postgraduate Alone                       3
Graduate and Postgraduate                6
All Levels                               2
Other                                    5
No Answer                                1


Approximately 50% of the respondents
engaged in some teaching activities, and
about the same percentage engaged in training
activities (see Table 6). Those who engaged
in teaching were requested to answer
three subquestions: "setting for teaching?"
"courses taught?" and "level of courses
taught?" The answers to these subquestions
are presented in Tables 20, 21, and 22, respectively.
Those who engaged in training
were requested to answer two subquestions:
"setting for training?" and "areas for
training?" The replies to the latter subquestion
proved too varied for systematic
coding, so the results are not presented here.
The tabulations of the replies to the first
subquestion are presented in Table 23.

An interesting facet of the results pre- 
sented in Table 20 is that 48% of the re- 
spondents who engage in teaching (or 90 
respondents) do some teaching in depart- 
ments of psychology, while only 12% of all 
the respondents (or 42 respondents) are 
employed in departments of psychology 
(see Table 3). These findings are interest- 
ing because of the often heard statement 
that departments of psychology should 
make use of nonacademic psychologists in
teaching specific courses to take advantage
of their experience gained "in the field."
Although the data presented are by no
means clear-cut, it does appear that many
psychologists interested in or engaged in
community mental health activities are being
given the opportunity to present their
views to students.
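
The head counts behind this comparison follow directly from the tabled percentages, and can be checked with a few lines. The sketch below only reproduces that arithmetic; the variable names are illustrative, and the final bound assumes, for the sake of argument, that every respondent employed in a department of psychology also teaches there.

```python
N_TEACHING = 187        # respondents who engage in teaching (Table 20)
N_TOTAL = 354           # all respondents
PCT_TEACH_IN_PSYCH = 48 # teach in departments of psychology (Table 20)
PCT_EMPLOYED_PSYCH = 12 # employed in departments of psychology (Table 3)
N_EMPLOYED_PSYCH = 44   # subgroup size printed in Tables 1 through 5

teach_in_psych = round(N_TEACHING * PCT_TEACH_IN_PSYCH / 100)
employed_from_pct = round(N_TOTAL * PCT_EMPLOYED_PSYCH / 100)

# Lower bound on those teaching in psychology departments while employed elsewhere,
# under the assumption that all 44 department employees are among the teachers.
teach_but_employed_elsewhere = teach_in_psych - N_EMPLOYED_PSYCH

print(teach_in_psych, employed_from_pct, teach_but_employed_elsewhere)
```

The figure of 42 quoted in the text is what the rounded 12% yields; the tables themselves list 44 respondents in departments of psychology, and either figure leaves roughly half of the 90 teaching there while holding a main position elsewhere.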

The category of “other” used in Table 6 
included a great many different activities. 
However, the most usual type of activity 
mentioned in this category had to do with 
memberships in various professional or 
civic groups, e.g., "Treasurer of State Psy- 
chological Association," or "serving on a 
city commission dealing with the aged." 
Other types of activity reported were of 
a more specific nature such as "writing a 
book," "studying," "entertaining visiting 
firemen," and so on. 

Summary. It would appear from the 
data presented thus far that psychologists 


TABLE 23
REPLIES TO "SETTING FOR TRAINING" BY THOSE WHO ENGAGE IN TRAINING ACTIVITIES
(N = 173)

                             Percentage of Those
Setting for Training         Doing Training

Clinics                             38
Institutions                        24
Universities                        20
Schools                             10
Government Systems                  [illegible]
Other                                4
No Answer                            3




interested in or engaged in community men- 
tal health are chiefly characterized by two 
activities: consultation and administration. 
Four out of five of the respondents re- 
ported some activity in each of these areas. 
It was found that much of the consultation 
is with nonmental health personnel in the 
areas of program planning and nontradi- 
tional methods of therapy and diagnosis. 
A sizable percentage of these respondents 
also participate in research, and the charac- 
teristic feature of their research is that it 
is as likely to be practical research, as theo- 
retical research. Many of these respondents 
carry on community education activities, 
but the average time spent in this activity 
is only a half-day a month. There is a 
tendency for these community education 
efforts to be aimed at PTA groups, but 
many other segments of the community are 
included to varying degrees. Lecturing is 
still the most commonly used method of 
imparting mental health information, with 
the discussion group method being the sec- 
ond most commonly used method. 

About two out of three of the respond- 
ents carry on some therapeutic or diag- 
nostic activities. Many of these activities, 
however, are likely to be carried on in pri- 
vate practice outside the respondents' main 
positions. Their therapy is characterized by 
the use of short-term methods in preference 
to long-term methods, and their diagnosis 
is often for the purpose of program plan- 
ning as well as for aiding treatment. 

About half of the respondents reported 
some activity in each of the areas of in- 
service education, teaching, and training. 
Much of the in-service education is with 
nonmental health personnel, but the re- 
mainder of the in-service education and the 
teaching and training is largely directed 
toward mental health colleagues. It would 
appear that the respondents have ample 
opportunity to impart their views to their 
colleagues within the mental health profes- 
sions. 


Future Activities 


The respondents were asked to rank or- 
der those activities they felt they will be 


doing more of in the future. They were
further asked to make these rankings on 
the basis of what they would like to do 
more of combined with what they would 
be required to do more of, i.e., a combina- 
tion of desire with reality factors. Tabula- 
tions of the replies were made of: the total 
number of times each activity was included 
in the rankings (whatever the rank as- 
signed), and the total number of times 
each activity was assigned the rank of 1, 2, 
or 3. The first tabulations are not included 
here in the interest of space, but reference 
will be made to these tabulations from time 
to time when appropriate. For the second 
set of tabulations, the decision was made to
include the first three ranks (rather than 
just the first rank or the first and second 
ranks) because some respondents did not 
rank their choices but rather just placed a 
check mark in front of their choices—and 
most of these latter respondents chose no 
more than three activities. Therefore, by 
basing the tabulations on the first three 
ranks, the replies from most of the respond- 
ents who did not rank their choices could 
be included in the tabulations and not dis- 
carded. The results of these tabulations are 
presented in Table 24. 
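
The two tabulations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `tabulate`, the reply format (a dict of ranks for respondents who ranked, a set for respondents who only checked), and the sample replies are all hypothetical and are not the survey's data. It also assumes that unranked replies naming more than three activities were left out of the second tabulation, which is one reading of the text.

```python
from collections import Counter

def tabulate(replies, top_n=3):
    """Return (included, top_ranked) counts of activity mentions.

    Each reply is either a dict {activity: rank} (respondent ranked the
    choices) or a set of activities (respondent only placed check marks).
    """
    included = Counter()    # first tabulation: mentioned at any rank
    top_ranked = Counter()  # second tabulation: ranked 1..top_n
    for reply in replies:
        if isinstance(reply, dict):
            included.update(reply.keys())
            top_ranked.update(a for a, r in reply.items() if r <= top_n)
        else:
            included.update(reply)
            # Assumption: check-mark replies count toward the second
            # tabulation only when no more than top_n activities were chosen.
            if len(reply) <= top_n:
                top_ranked.update(reply)
    return included, top_ranked

# Hypothetical replies, for illustration only.
replies = [
    {"consultation": 1, "research": 2, "teaching": 4},
    {"administration": 1, "consultation": 3},
    {"consultation", "therapy"},                       # check marks, two choices
    {"research", "therapy", "diagnosis", "teaching"},  # check marks, four choices
]
included, top_ranked = tabulate(replies)
percent_top = {a: 100 * n / len(replies) for a, n in top_ranked.items()}
```

Dividing each count by the number of respondents gives percentages of the kind reported in Table 24; the any-rank percentages come from `included` in the same way.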

Total Group. The most striking feature 
of the results presented in Table 24 is that 
the ranks of the most frequently mentioned 
future activities by the total respondents 
are similar to the ranks of the most fre- 
quently mentioned present activities (see 
Table 6). That is, the highest percentages 
of the respondents were presently engaged 
in consultation, administration, and re- 
search; and the highest percentages of the 
respondents indicated that they anticipate 
doing more of consultation, administration, 
and research in the future. Thus, it ap- 
pears that the total group of respondents 
anticipates doing more of what it is pres- 
ently doing. One noteworthy exception to 
this generalization is the comparatively low 
percentage of respondents who anticipate 
doing more diagnostic activities in the fu- 
ture. 

Degree. When the replies from the re- 
spondents with doctorate degrees were com- 
pared to the replies from respondents with-
out doctorate degrees, very few differences
were noted. It can be seen from Table 24
that these groups differed on only four
activities, and three of these differences
were significant only at the .10 level.
Thus, it can be
said only with caution that more of the doc- 
torates anticipate doing more teaching than 
the nondoctorates, and more of the nondoc- 
torates anticipate doing more community 
education, diagnosis, and in-service educa- 
tion than the doctorates. 


Place of Employment. When the re- 
spondents were placed into subgroups ac- 
cording to their place of employment, and 
their replies compared, it was found that 
there were significant differences between 
these subgroups (see Table 24). The high- 
lights of these differences are as follows: 
(a) more of those employed in government 
settings anticipate doing more consultation 
than those employed in other settings, and 
less of those in departments of psychology 
anticipate doing more consultation than 
those in other settings; (b) more of those 
employed in academic settings anticipate 
doing more research and more teaching than 
those employed in other settings; (c) more 
of those employed in government settings 
anticipate doing more administration in the 
future than those employed in other set- 
tings; (d) more of those employed in 
school settings anticipate doing more in-
service education than those employed in
other settings, and less of those employed 
in academic settings anticipate doing more 
in-service education than those employed in 
other settings; (e) more of those employed 
in school settings and in hospitals or clinics 
anticipate doing more therapy than those 
employed in other settings and less of those 
employed in government settings anticipate 
doing more therapy than those employed in 
other settings; and (f) more of those em- 
ployed in schools anticipate doing more di- 
agnosis than those employed in other set- 
tings. 

Main Activities. When the respondents 
were placed into subgroups according to 
their main activities, and their replies com- 
pared, it was found that there were signifi- 
cant differences in their replies (see Table
24). The same generalization can be ap-
plied to these results as was applied to the
results obtained from the total group, i.e.,
the respondents anticipate doing more in 
the future of what they are presently do- 
ing. It can be seen from Table 24 that 
compared to those mainly engaged in other 
activities: (a) more of those engaged in re- 
search and/or teaching anticipate doing 
more research and teaching in the future, 
(b) more of those mainly engaged in ther- 
apy and/or diagnosis anticipate doing more 
therapy and diagnosis in the future, and, 
(c) more of those mainly engaged in ad- 
ministration and/or consultation anticipate 
doing more administration and consultation 
in the future. In addition to this relation- 
ship between present and anticipated future 
activities, it was found that more of those 
mainly engaged in therapy and/or diagnosis 
anticipate doing more community education 
than those mainly engaged in other activ- 
ities, and less of those mainly engaged in 
research and/or teaching anticipate doing 
more in-service education than those mainly 
engaged in other activities. 


Opinion on What Community Mental 
Health Is. When the respondents were 
placed into subgroups according to their 
opinions on what the term “community 
mental health" denotes, and their replies 
compared, very few, if any, differences
were found between the subgroups (see 
Table 24). Only two of the eight compari- 
sons made for these subgroups were sig- 
nificant—one at the .05 level and one at the 
.10 level. Since there was no way to test
the significance of the over-all differences 
between these subgroups, there is no basis 
for assuming that the obtained two signifi- 
cant differences were not due to chance. 
The safest conclusion to apply to these re- 
sults is that no relationship was found be- 
tween opinion on what the term “commu- 
nity mental health" denotes and anticipated 
future activities. 
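
The caution about chance can be made concrete with a small binomial calculation. The sketch below is not part of the original analysis. It assumes, for illustration only, that the eight comparisons are independent and that each is tested at the .10 level; the survey's comparisons share the same respondents, so the independence assumption is at best approximate. The function name `prob_at_least` is hypothetical.

```python
from math import comb

def prob_at_least(k, n, alpha):
    """P(at least k of n independent tests reach significance at level
    alpha when every null hypothesis is true)."""
    return sum(
        comb(n, j) * alpha**j * (1 - alpha) ** (n - j)
        for j in range(k, n + 1)
    )

n_comparisons = 8  # number of comparisons reported for these subgroups
p_one_or_more = prob_at_least(1, n_comparisons, 0.10)
p_two_or_more = prob_at_least(2, n_comparisons, 0.10)
print(round(p_one_or_more, 3), round(p_two_or_more, 3))
```

Under these assumptions, at least one nominally significant result would be expected in roughly 57 of 100 such sets of comparisons, and two or more in roughly 19 of 100, which is consistent with treating the two obtained differences cautiously.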

Summary. As a group, more of the re- 
spondents are anticipating doing more con- 
sultation, administration, and research in 
comparison to those anticipating doing 
more of other activities. It should be re-
membered that the data presented in Table
24 are only the percentages of respondents
who ranked each activity either 1, 2, or 3. 
The percentages for all rankings are much 
higher for each activity. Thus, the per- 
centage of respondents who included “con- 
sultation” in their choice of future activities 
(no matter what rank assigned to this 
activity) was 75 in comparison to the 57% 
(Table 24) who ranked this activity either 
1, 2, or 3. For research, the percentage in- 
cluding this activity among their rankings 
was 72; and for administration, the per- 
centage was 63. As a summary statement, 
therefore, it can be said that sizable per- 
centages of the respondents are anticipating 
doing more consultation, administration, 
and research.

It was also found in this section that 
anticipated future activities are related to 
present activities and to place of employ- 
ment. Those mainly engaged in research 
anticipate doing even more research in the 
future, those mainly engaged in therapy 
anticipate doing even more therapy in the 
future, and so on. As an over-all general 
summary on the relationship between place 
of employment and anticipated future activ- 
ities, it can be said that in comparison to 
those in other settings more of those in 
academic settings anticipate doing more re- 
search and teaching in the future, more of 
those in government settings anticipate do- 
ing more consultation and administration 
in the future, and more of those in hos- 
pitals, clinics, or schools anticipate doing 
more therapy and diagnosis. However, it 
should be noted that except for those em- 
ployed in academic settings, the largest per- 
centages of all subgroups anticipate doing 
more consultation than any other single 
activity. 


Desired Content Areas 


The respondents were requested to rank 
order the content areas that they felt would 
have offered the best education and train- 
ing in preparing them for their present ac- 
tivities. The replies to this question were 
tabulated in two ways: (a) the percentage 
of respondents who included any given con-
tent area in their rankings, no matter what
rank assigned to the given content area;
and (b) the percentage of respondents who 
assigned any given content area the rank 
of either 1, 2, or 3. The results of the first 
tabulations are not presented here in the 
interest of space, but reference may be 
made to these tabulations when appropriate. 
The results of the second set of tabulations 
are presented in Table 25. 

Total Group. Almost half of the re- 
spondents included consultation among 
their first three choices of desirable content 
areas (see Table 25). Since a large per- 
centage of these respondents are presently 
engaged in consultation activities, and a 
large percentage anticipate doing even more 
consulting in the future, it is not surprising 
to find that training in consultation would 
be high on their list of desirable content 
areas for training programs. Education 
and training in the areas of allied social 
sciences and community organization also 
were among the most frequently chosen 
desirable content areas. This undoubtedly 
is a reflection of the interest this group has 
in community mental health in comparison 
to other areas of the total field of mental 
health.

Degree. It can be seen from Table 25 
that there were no significant differences 
worthy of noting between those respond- 
ents with a doctorate degree and those with- 
out a doctorate degree. 

Place of Employment. When the re- 
spondents were placed into subgroups ac- 
cording to their place of employment, and 
their replies compared, significant differ- 
ences were found between the subgroups’ 
choices of desired content areas. It can be 
seen from Table 25 that in comparison
with those employed in other settings: (a) 
more of those employed in government set- 
tings chose consultation, community organi- 
zation, and administration as desirable con- 
tent areas for training programs; (b) more 
of those employed in academic settings 
chose allied social sciences as a desirable 
content area for training programs; (c) 
more of those employed in hospitals, clin- 
ics, or schools chose short-term therapy as 
a desirable content area for training pro-
grams; (d) more of those employed in gov-
ernment settings, hospitals, or clinics chose
public health principles as a desirable con-
tent area for training programs; (e) more 
of those employed in academic departments 
other than psychology chose public health 
research methods (epidemiology, ecology, 
biostatistics) as a desirable content area 
for training programs; (f) less of those 
employed in academic departments other 
than psychology chose consultation as a de- 
sirable content area for training programs; 
(g) less of those employed in academic set- 
tings chose administration as a desirable 
content area for training programs; (h)
less of those employed in government set- 
tings, hospitals, or clinics chose action re- 
search as a desirable content area for train- 
ing; and (i) less of those employed in 
schools (and perhaps less of those employed 
in hospitals or clinics) chose public health 
research methods as a desirable content 
area for training programs. 


Main Activities. When the respondents 
were placed into subgroups according to 
their main activities, and their replies com- 
pared, significant differences were found 
between the subgroups' choices of desired 
content areas. It can be seen from Table 
25 that in comparison with those mainly en- 
gaged in other activities: (a) more of those 
mainly engaged in research and/or teaching 
chose allied social sciences, action research, 
and public health research methods as de-
sirable content areas for training programs;
(b) more of those mainly engaged in ad- 
ministration and/or consultation chose 
administration as a desirable content area 
for training programs; (c) more of those 
mainly engaged in therapy and/or diagnosis 
chose short-term therapy as a desirable con- 
tent area for training; and (d) less of those 
mainly engaged in research and/or teaching 
chose consultation and community organiza- 
tion as desirable areas for training pro- 
grams. 

Opinion on What Community Mental 
Health Is. When the respondents were 
placed into subgroups according to their
opinions on what the term “community
mental health” denotes, and their replies
compared, some significant differences were
found between the subgroups’ choices of 
desired content areas. It can be seen from 
Table 25 that in comparison with those who 
held other opinions: (a) more of those who 
believe that “community mental health” de-
notes an attitude chose consultation and
short-term therapy as desirable content
areas for training programs, (b) more of
those who believe that “community mental
health” denotes a specialty chose public
health principles and public health research
methods as desirable content areas for
training programs, and (c) less of those
who believe that “community mental
health” denotes an attitude chose allied so-
cial sciences as a desirable content area for 
training programs. 

Summary. It was found in the previous 
sections that, as a total group, the highest 
percentages of the respondents were pres- 
ently engaging in consultation and adminis- 
tration and were anticipating doing even 
more consultation and administration in the 
future. It is not surprising, therefore, to
find in this section that the highest per- 
centage of respondents believe training in 
consultation would have been helpful for 
them, and that a high percentage believe 
that training in administration would also 
have been helpful. 

One noteworthy finding was that there 
were no significant differences between the 
doctorates and nondoctorates in their 
choices of desirable content areas. A sec- 
ond noteworthy finding is that education 
and training in public health is more often 
advocated by those who believe community 
mental health is a specialty area than by those 
who believe otherwise. And a third note- 
worthy finding is that education and train- 
ing in allied social sciences is more often 
advocated by those mainly engaged in re- 
search and/or teaching and those employed 
in academic departments than by those em- 
ployed elsewhere and primarily engaged in 
other activities. Finally, it should be noted 
that except for those employed in academic 
departments other than psychology, the 
highest or next highest percentage of each 
subgroup chose consultation as a desired 
content area for training.

As an over-all general summary state-
ment for the findings in this section, it can 
be said that the desired content areas for 
training were often directly linked to the 
respondents' present activities (e.g., those 
engaging in therapy desired training in 
short-term therapy), but there was also a 
tendency for the respondents to indicate 
desire for education and training in com- 
munity oriented content areas, viz., allied 
social sciences, community organization,
and public health principles and research 
methods. 


Level for Education and Training 


After the respondents had been asked to 
indicate what content areas they felt would 
have been helpful in preparing them for
their present activities, they were asked to
indicate the level at which education and 
training in these areas should be made gen- 
erally available to psychologists. Their re- 
plies to this latter question were tabulated 
and statistically analyzed, and the results 
are presented in Table 26. 


Total Group. Approximately half of the 
respondents felt that education and train- 
ing of this type could be offered at the pre-
doctoral level, and approximately one-third
of the respondents felt that this type of 
education and training should be at the 
postdoctoral level. Those who believe that 
this education and training should be of- 
fered at the postdoctoral level were evenly 
split between those who thought the train- 
ing should be offered immediately after the 
PhD is received and those who thought the
training should be offered after a few
years of postdoctoral experience.

Approximately 1 out of 10 of the re-
spondents felt he could not specify one
level for the training. The most frequent
reason given for this was that "the training
level would depend on the particular con-
tent area to be taught." It should be noted
that many of those who did specify one
level for the training also commented to the
effect that different content areas should be
offered at different levels of training.


TABLE 26

REPLIES TO QUESTION: “AT WHAT LEVEL SHOULD TRAINING IN COMMUNITY MENTAL HEALTH BE OFFERED?”

                                                            Immediate      Postdoctoral              No
Variable                                 N  Predoctoral  Postdoctoral  after Experience   Other  Answer

Total Subjects                         354           52%           17%               17%     12%      2%
Degree:
  Doctorates                           279           48            19                17      13       2
  Nondoctorates                         71           68             6                14      10       3
Place of Employment:
  Department of Psychology              44           45            14                23      12       4
  Other Academic Department             48           50            19                10      14       6
  Government Agency                     49           37            16                31      16       0
  Hospital or Clinic                   105           48            24                15      13       0
  School                                52           67            10                10       8       6
  Other                                 51           65            10                14      12       0
Main Activities:
  Varied                               114           50            17                15      16       2
  Research and/or Teaching              78           54            15                14      11       5
  Therapy and/or Diagnosis              85           56            18                18       7       1
  Administration and/or Consultation    68           44            16                25      12       1
Opinion on what the term “CMH” denotes:
  Attitude                             124           52            20                16      10       1
  Interest                             101           49            17                19      13       2
  Specialty                             65           51            18                20       9       1

Note. For results of tests of significance, see Appendix B.


Degree. When the replies from doctor- 
ates and nondoctorates were compared, it
was found that the over-all differences in 
these replies were significant at the .01 lev- 
el. From Table 26, it can be seen that in 
comparing these two subgroups more of 
the nondoctorates favor the predoctoral 
level for this type of education and train- 
ing, and more of the doctorates favor the 
postdoctoral level. 


Place of Employment. It can be seen
from Table 26 that the over-all differences 
in the replies received from respondents 
employed in different settings were signifi- 
cant at the .05 level. The main differences 
between these subgroups appear to be the 
following: (a) more of those employed in
government settings favor the postdoctoral
level over the predoctoral level in compari-
son to the other subgroups, and (b) more
of those employed in school settings favor
the predoctoral level over the postdoctoral
level in comparison with other groups. The
preference of those employed in school set-
tings for the predoctoral level may be re-
lated to the fact that the proportion of non-
doctorates employed in school settings was
much higher than the proportion of non-
doctorates in other settings, and it was
found that the nondoctorate respondents
tended to favor the predoctoral level over
the postdoctoral level (see Degree subsec-
tion above). Of course this double relation-
ship leaves unanswered the question of
whether it is the fact of being employed in
a school setting or whether it is the fact
of not having a doctorate degree that is the
strongest influence in favoring the predoc-
toral level over the postdoctoral level.


Main Activities and Opinion on What 
Community Mental Health Is. The over-all 
differences in the replies received from re- 
spondents mainly engaged in different activ- 
ities and from respondents having different 
opinions on what the term “community 
mental health” denotes were not significant 
(see Table 26). 


Summary. The over-all generalization of 
the findings in this section can be stated as 
follows: out of every 10 respondents, about 
5 believed that education and training in 
the content areas they thought desirable 
should be offered at the predoctoral level, 
about 3.5 believed that it should be at the 
postdoctoral level, and 1.5 believed that the 
training should be offered at various levels 
depending on the content areas. The nota- 
ble exceptions to this generalization are due 
to the subgroup employed in government 
settings which tends to favor the postdoc- 
toral level over the predoctoral level, and 
the subgroup employed in schools and the 
subgroup of nondoctorates both of which 
tend to favor the predoctoral level over the 
postdoctoral level. 


Locus for Predoctoral Training 


The respondents were asked where they 
thought predoctoral training in the content 
areas they chose as desirable should be 
primarily offered. The emphasis was placed 
on the word “primarily” because it was 
realized that education and training in some 
content areas may necessarily have to be 
taught in specific departments, and what 
was wanted in answer to the question was 
the respondents’ opinions on where they 
thought responsibility for the major portion 
of the training should lie. The respond- 
ents’ replies to this question were tabulated 
and statistically analyzed, and the results 
are presented in Table 27. 


Total Group. It can be seen from Table 
27 that for the total group of respondents 
there was a much larger percentage who 
favored interdepartmental programs to pro- 
vide predoctoral training for this area than 
the percentage who favored departments 
of psychology to provide this training.


TABLE 27

REPLIES TO QUESTION: “WHERE SHOULD PREDOCTORAL TRAINING OF THIS TYPE BE PRIMARILY OFFERED?”

                                              Department  Interdepartmental   Combination              No
Variable                                 N of Psychology            Program    of 1 and 2   Other  Answer
                                                     (1)                (2)

Total Subjects                         354            21%                56%            9%      6%      1%
Degree:
  Doctorates                           279            21                 54            10       7       7
  Nondoctorates                         71            17                 65             6       6       7
Place of Employment:
  Department of Psychology              44            29                 34            16      10       9
  Other Academic Department             48            10                 62            12       8       6
  Government Agency                     49            16                 61            10      10       2
  Hospital or Clinic                   105            24                 54             8       5       9
  School                                52            21                 54            11       2      11
  Other                                 51            20                  7             0       6       4
Main Activities:
  Varied                               114            22                 60             6       6       7
  Research and/or Teaching              78            11                 56            13      11       8
  Therapy and/or Diagnosis              85            29                 54             8       1       7
  Administration and/or Consultation    68            18                 56            12       8       6
Opinion on what the term “CMH” denotes:
  Attitude                             124            28                 52             9       6       6
  Interest                             101            20                 58             7       5      10
  Specialty                             65             9                 75             8       5       1

Note. For results of tests of significance, see Appendix B.


Nine percent of the respondents indicated 
that this type of predoctoral training should 
be primarily offered by a combination of 
departments of psychology and interde- 
partmental programs. It is uncertain how 
the opinions of this 9% differ from the 
opinions of the group who believe the pre- 
doctoral training should be offered prima-
rily within interdepartmental programs,
since undoubtedly many, if not all, of the
latter group would include departments of 
psychology in the interdepartmental pro- 
grams. 

Degree. There were no over-all signifi-
cant differences between the replies re-
ceived from doctorates and the replies re-
ceived from nondoctorates in answer to this
question (see Table 27).


Place of Employment. The over-all dif-
ferences in the replies received from re-
spondents employed in different settings, in
answer to this question, were significant at
the .10 level (see Table 27). The most
obvious differences in the replies are those
received from the subgroup employed in
departments of psychology and those re-
ceived from the subgroup employed in aca-
demic departments other than psychology.
In comparison with the other subgroups,
less of those employed in departments of
psychology felt that predoctoral training
in the desired content areas should be pri-
marily offered within interdepartmental pro-
grams, and less of those employed in aca- 
demic departments other than psychology 
felt that this training should be offered pri- 
marily within departments of psychology. 

Main Activities. The over-all differences 
in the replies received from respondents 
mainly engaged in different activities, in 
answer to this question, were significant at 
the .01 level. It can be seen from Table 27 
that, in comparison to those mainly en- 
gaged in other activities, more of those 
engaged in therapy and/or diagnosis felt 
that departments of psychology should be 
the primary locus for predoctoral training 
in the desired content areas, and less of 
those engaged in therapy and/or diagnosis 
felt that the locus for this training should 
be somewhere other than in either depart- 
ments of psychology or interdepartmental 
programs. 

Summary. It would be fair to summarize
the findings in this section by stating that 
a majority of the respondents favor inter- 
departmental programs to provide predoc- 
toral education and training in the content 
areas named in a previous section (see 
Desired Content Areas), and a minority 
favor departments of psychology to provide 
this education and training. This general 
statement holds true despite the fact that 
the ratio of those who favor interdepart- 
mental programs to those who favor de- 
partments of psychology may differ for 
particular subgroups among the total re- 
spondents. 


Locus for Postdoctoral Training 


The respondents were asked where they 
thought postdoctoral training in the content 
areas they thought desirable should be pri- 
marily offered. The replies received were 
tabulated and analyzed, and the results are 
presented in Table 28.

Total Group. It can be seen from Table 
28 that only a negligible percentage of the 
respondents felt that postdoctoral training
in the aforementioned areas should be pri- 
marily offered within departments of psy- 
chology or within schools of public health.
The largest percentage of the respondents
felt that the locus for this postdoctoral
training should be within interdepartmental 
programs, with a somewhat smaller per- 
centage believing that the locus should be 
within ongoing community mental health 
programs, and a still smaller percentage 
favoring a combination of the two to pro- 
vide the postdoctoral training. 


Degree. When the replies received from 
doctorates were compared with the replies 
received from nondoctorates, the over-all 
differences between the two subgroups' re- 
plies were found to be significant at the .01 
level. In looking at the results presented in 
Table 28, however, there is little about the 
differences between these two subgroups 
that appears noteworthy. Both groups ap- 
pear to follow the same trend of favoring 
interdepartmental programs, ongoing com- 
munity mental health programs, or a com- 
bination of the two to provide the locus 
for postdoctoral training. The differences 
in their replies appear to be minor in com- 
parison with the similarities. 

Place of Employment. It was not possi- 
ble to test the significance of the over-all 
differences in the replies received from sub- 
groups employed in different settings. The 
number of employment settings (six) com- 
bined with the number of classifications 
used in coding the replies (seven) and the 
distribution of replies made many of the 
expected frequencies too small to use in a 
χ² analysis.
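
The difficulty with small expected frequencies can be illustrated with a short sketch. The counts below are hypothetical and are not the survey's data, and the function names are chosen for illustration. The threshold of 5 is a common rule of thumb for the χ² test and is an assumption here, since the text does not state which criterion was applied.

```python
def expected_frequencies(table):
    """Expected cell counts under independence:
    (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def share_of_small_cells(table, minimum=5.0):
    """Fraction of cells whose expected count falls below `minimum`."""
    cells = [e for row in expected_frequencies(table) for e in row]
    return sum(e < minimum for e in cells) / len(cells)

# Hypothetical counts: 3 employment settings x 4 reply categories.
counts = [
    [12, 7, 2, 1],
    [15, 5, 1, 0],
    [20, 10, 3, 2],
]
expected = expected_frequencies(counts)
small = share_of_small_cells(counts)
```

With many reply categories and several small subgroups, a large share of cells falls below the threshold, which is the situation described for the six settings and seven reply classifications.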

Main Activities. The over-all differences 
in the replies received from subgroups main-
ly engaged in differing activities were sig-
nificant at the .10 level. In looking at the
results presented in Table 28, however,
there is little about the differences among
these subgroups that appears noteworthy.

Opinions on What Community Mental 
Health Is. The over-all differences in the 
replies received from subgroups based on 
opinions concerning what the term “community
mental health” denotes were significant
at the .10 level. It can be seen from
Table 28 that in comparison with the other 
subgroups more of those who believe “community
mental health” is an area of interest


TABLE 28

REPLIES TO QUESTION:
“WHERE SHOULD POSTDOCTORAL TRAINING OF THIS TYPE BE PRIMARILY OFFERED?”

Columns: (1) Interdepartmental Program; (2) Ongoing CMH Program;
(3) Combination of 1 and 2; (4) Workshops; (5) Departments of Psychology;
(6) Schools of Public Health; (7) Other; (8) No Answer.
All entries except N are percentages.

Variable                                  N  (1)   (2)   (3)   (4)   (5)   (6)   (7)   (8)
Total Subjects                          354   30%   19%   12%    7%    3%    3%   15%   10%
Degree:
  Doctorates                            279   29    21    10     6     4     4    17     9
  Nondoctorates                          71   35    10    21     8     0     0    10    15
Place of Employment:
  Department of Psychology               44   27    16    18     7     4     7    11   [?]
  Other Academic Department              48   33    10    10     4     2     6    25     8
  Government Agency                      49   26    31    10    10     2     4    14     2
  Hospital or Clinic                    105   36    23     5     6     4     3    14     9
  School                                 52   29     8    12    10     4     0    14    25
  Other                                  51   23    23    24     6     2     2    16     4
Main Activities:
  Varied                                114   30    19    17     7     3     1    13    10
  Research and/or Teaching               78   32    17     7     3     3     9    20    10
  Therapy and/or Diagnosis               85   29    20    14     8     5     2    11    11
  Administration and/or Consultation     68   31    19     4     9     1     3    21    10
Opinion on what the term “CMH” denotes:
  Attitude                              124   26    24    14    10     2     2    13    10
  Interest                              101   41    12     8     8     4     2   [?]     9
  Specialty                              65   32    23    10     1     3     6    20     3

Note. For results of tests of significance, see Appendix B.


felt that the locus for postdoctoral training 
programs should be in interdepartmental 
programs, and less of this subgroup felt 
that the locus should be in ongoing com- 
munity mental health programs. 


Summary. As an over-all generalization 
of the findings in this section, it may be 
said that the respondents favor either inter- 
departmental programs or ongoing commu- 
nity mental health programs, or a combina- 
tion of both, to provide postdoctoral 
education and training in the content areas 
named in a previous section (see Desired 
Content Areas). Some minor differences of 
opinion were found in various subgroups,
but these differences were not great enough
to contradict the generalization. 


Influences in Development of Interest 
in Community Mental Health 


The respondents were asked to rank 
order the influences that were important in 
their development of interest in community 
mental health. Some respondents did not 
rank order their choices but instead used
check marks to indicate their choices. An
examination of the replies revealed that
those who used check marks generally made 
no more than two choices; therefore, it was 
decided that by basing the tabulations of
the replies on the total number of times
each influence was ranked either 1 or 2, 
the replies from most of the respondents 
who did not rank their choices could be 
included in the tabulations and not be dis- 
carded. The results of these tabulations 
are presented in Table 29. 
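
The tabulation rule described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' own coding scheme: it assumes that a ranked reply is stored as a mapping from influence to rank, that a check-mark reply is stored as a set of checked influences, and that check-mark replies with more than two checks are the ones left out. The sample replies are hypothetical.

```python
from collections import Counter


def tabulate_top_two(replies):
    """Count how often each influence was ranked 1 or 2 (or checked).

    Ranked replies are dicts {influence: rank}; check-mark replies are sets.
    Check-mark replies with more than two checks are excluded (an assumption).
    Returns (counts, number of replies included).
    """
    counts = Counter()
    included = 0
    for reply in replies:
        if isinstance(reply, dict):
            chosen = [name for name, rank in reply.items() if rank in (1, 2)]
        elif len(reply) <= 2:
            chosen = list(reply)
        else:
            continue  # too many check marks to treat as first or second choices
        included += 1
        counts.update(chosen)
    return counts, included


# Hypothetical replies, for illustration only.
sample_replies = [
    {"job duties": 1, "nonpsychology colleagues": 2, "Zeitgeist": 3},
    {"job duties", "university staff"},
    {"job duties", "Zeitgeist", "other"},
    {"psychology colleagues": 1, "job duties": 2},
]
counts, included = tabulate_top_two(sample_replies)
for influence, n in counts.most_common():
    print(f"{influence}: {n} of {included} replies")
```

Percentages of the kind shown in Table 29 would then be each count divided by the number of replies included.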


Total Group. It can be seen from Table 
29 that the largest percentage of the re- 
spondents indicated that job duties were 
among the chief influences in their devel- 
opment of community mental health inter- 
ests. Approximately 2 out of 3 of the 
respondents ranked this influence either 1
or 2. The rather surprising facet of the 
results presented in Table 29 is the rela- 
tively low percentage of the respondents 
who considered that either university staff 
or internship supervisors were important 
influences in their development of commu- 
nity mental health interests. 


Degree. No significant differences were 
found between the replies received from 
respondents with doctorate degrees and the 
replies received from respondents without 
doctorate degrees (see Table 29). 


Place of Employment. When a compari- 
son was made of the replies received from 
the subgroups based on place of employ- 
ment, it was found that these replies dif- 
fered but slightly (see Table 29). Only 
three of the seven differences tested reached 
acceptable significance levels, so interpreta- 
tions of the obtained differences can only 
be made with caution. With this limitation 
in mind, then, it can be said that in compari- 
son with those employed in other settings: 
(a) there was a tendency for less of those 
employed in academic departments other 
than psychology to consider job duties as 
an important influence in their development 
of community mental health interests (al- 
though nearly one-half of this subgroup 
did consider job duties as an important 
influence), (b) there was a tendency for 
more of those employed in government settings
and in hospitals or clinics to consider
nonpsychology colleagues as important 
influences, and (c) there was a tendency 
for more of those employed in academic 
settings to list idiosyncratic influences
(“other”) as important in their develop-
ment of community mental health interests. 


Main Activities. It can be seen from 
Table 29 that there were some significant 
differences among the replies received from 
subgroups based on main activities. It ap- 
pears that in comparison with those mainly 
engaged in other activities: (a) less of 
those mainly engaged in research and/or 
teaching consider job duties or psychology 
colleagues to be important influences in 
their development of community mental 
health interests, and more of this subgroup 
consider idiosyncratic influences (“other”) 
as being important; (b) less of those main- 
ly engaged in administration and/or con- 
sultation consider the Zeitgeist to be an 
important influence; and (c) less of those
engaged in therapy and/or diagnosis con- 
sider university staff to be an important in- 
fluence. 


Opinion on What Community Mental
Health Is. When a comparison was made 
of the replies received from the subgroups 
based on opinions on what the term “community
mental health” denotes, only two of
seven differences tested were found to be 
significant (see Table 29). With caution, 
then, it can be said that in comparison with 
those who hold other opinions on what the 
term “community mental health" denotes: 
there is a tendency for more of those who 
believe that community mental health should 
be a specialty to attribute their interest in 
community mental health to a dissatisfaction
with traditional duties, and there is a
tendency for more of this subgroup to con- 
sider idiosyncratic influences (“other”) as 
being important. 

Summary. As a general summary state- 
ment, it can be said that the strongest influ- 
ences in the development of community 
mental health interests are job duties, and 
the weakest influences are educators and 
trainers. This conclusion is substantiated 
by the finding reported earlier (see Table 
1) that, on the average, the respondents 
did not develop interest in community men- 
tal health until almost a year after they had 
received their final degrees.


General Comments by Respondents


In addition to answering the specific 
questions in the questionnaire, many re- 
spondents offered general comments about 
various aspects of the field of community 
mental health. In most cases, these com- 
ments were spirited, provocative, and ex- 
tremely informative. Unfortunately, the 
volume of these comments forbids their 
verbatim inclusion in this report. This is 
unfortunate because the tone, emphasis, 
and content of these comments give a rich 
and comprehensive "feel" of the problems 
and hopes of psychologists in community 
mental health. An initial attempt was made 
to “quantify” these comments so that they 
could be included here, but it became evi- 
dent that such quantification detracted much 
from the value of the spontaneous com- 
ments, and the quantification attempts were 
abandoned. In order that these comments 
may not be lost altogether, they have been 
mimeographed and are available for
distribution.⁵

The reader of these comments will notice 
that there was a wide range of opinions 
among the respondents concerning the field 
of community mental health and the train- 
ing for this field. Some respondents believe 
that community mental health represents a 
totally new direction for the field of psy- 
chology while others believe that commu- 
nity mental health is merely a slight varia- 
tion of an old theme for psychologists. 
Some are quite vitriolic in criticizing oppo- 
sition to the establishment of a community 
mental health specialty within psychology, 
and others are just as vitriolic in criticiz- 
ing those who advocate the establishment 
of a new specialty. Some would like to 
have technique-centered training programs,
and others would prefer to have broad,
attitude-oriented training programs. Some
feel that a clinical psychology background 
offers the best preparation for community 
mental health activities and others feel that 


5 Mimeographed copies of the verbatim general 
comments may be obtained by writing to: Chief 
Psychologist; Department of Psychiatry (Box C), 
Massachusetts General Hospital; Boston 14,
Massachusetts.


such a background is a definite handicap
in community work. 

However, despite the divergencies, there 
do appear to be a few opinions held by the 
majority of the respondents. Most of the 
respondents seem to believe that community 
mental health (whatever it is) holds much 
potential for the maximum utilization of 
the particular skills and contributions of 
the psychologist. In this regard, they see 
community mental health as providing the 
vehicle through which the (clinical) psy- 
chologist can free himself of the quasi- 
medical role which has so hampered his 
development in the past. There also seems 
to be agreement that present training pro- 
grams are not providing adequate prepara- 
tion for this promising new area of activ- 
ity. Although there is no unanimity of 
opinion on what an adequate training pro- 
gram would be like, it is widely agreed that 
more training is needed in allied social sciences
such as sociology and cultural anthropology.
When speaking of education and
training, the word "broadening" is used
time and time again by these respondents.
What the term seems to mean for most of 
those who use it is a greater awareness of 
the extrapsychic factors affecting human 
behavior, a greater concern with the total
spectrum of human behavior (normal as
well as abnormal), familiarity with a
larger variety of methods of changing human
behavior, a greater emphasis on preventive
activities, and a working knowledge
of the sociopolitical context in which psy- 
chologists must operate. 

The above two paragraphs contain the 
authors’ impressions of the respondents' 
views. The reader is strongly urged to 
read the respondents’ comments for him- 
self (see Footnote 5). 


DISCUSSION 


No attempt will be made here to discuss 
all the findings of this survey. Rather the 
authors would like to take this opportunity 
to share with the reader some of their im- 
pressions gained in carrying out the survey 
and in perusing the results.

The one overriding impression the authors
have is that there is a great deal of interest 
in community mental health among psy- 
chologists throughout the country. This 
impression is based not only on the replies 
to the questionnaire, but also on correspond- 
ence carried on during the initial stages of 
the survey with key people in many federal 
and state agencies, regional associations, 
and committees of professional associa- 
tions. In almost every instance, requests 
by the authors for information or aid were 
responded to most generously with some 
expression of genuine interest in the prob- 
lem of preparing psychologists for commu- 
nity mental health activities. Coupled with 
this widespread interest, however, are four 
frequently asked questions which reflect the 
lack of basic data in this area. These ques- 
tions are: “What is community mental 
health?” “How many psychologists are in- 
volved in community mental health activ- 
ities?” “Should psychologists be involved 
in these activities?” and “What types of 
education and training would be appropri- 
ate for these activities?” After carrying out 
this survey, the authors would venture the 
following answers to these questions. 


What Is Community Mental Health? 


At the present time, there is no single 
answer to this question that would meet 
with the unanimous approval of all those 
who consider themselves engaged in this 
area. However, on the basis of the results 
of this survey, the following answer seems 
appropriate. 

Community mental health is an area of 
interest subsumed under the larger field of 
mental health. It is that part of the mental 
health field which is most concerned with 
the functioning of normal human beings in 
normal situations under normal circum- 
stances. "Normal" in this context includes 
the occasional difficulty in dealing with par- 
ticular problems, the occasional stressful 
situation, and the occasional disrupted cir- 
cumstances. It is a primary objective of 
those with community mental health inter- 
ests to learn what the relations are between 
individuals, situations, and circumstances.
This objective grows out of three related
purposes: to further the understanding of
human behavior; to prevent mental illness;
and to promote mental health, i.e., to maximize
the positive development of each individual's
potentialities.

Psychologists in community mental health 
pursue this objective in many different
ways. There are those who are mainly en- 
gaged in research in this area. What dis- 
tinguishes this group from other psycho- 
logical researchers are the variables under 
study and the research methods used. The 
researcher in community mental health is 
more likely to be investigating the effects 
of such independent variables as families, 
neighborhoods, or social institutions on such 
dependent variables as adjusting to school, 
adjusting to a death in the family, or
adjusting to disasters. Due to the difficulty,
or impossibility, of manipulating such variables,
the researcher in community mental
health is beginning to make more and more 
use of public health research methods. As 
the results of this survey indicate, the re- 
search oriented psychologist in community 
mental health is likely to consider commu- 
nity mental health to be an area of interest. 

Other psychologists in this area are main- 
ly engaged in traditional services such as 
therapy and diagnosis. The chief charac- 
teristics of this group are their emphases 
on situational therapy, the use of commu- 
nity resources in treating patients, the 
treatment of patients before their problems 
become seriously disabling, and the use of 
short-term therapy methods. In addition, 
this group is likely to be engaging in con- 
sultation with nonmental health personnel 
(e.g., teachers, ministers, nurses, etc.) so
that these personnel may identify and refer
potential mental health problems before 
they become serious, can more effectively 
collaborate in the treatment programs for 
some patients, and can become more aware 
of the mental health aspects of the programs
they institute or engage in. Many
of this group of psychologists also devote
some time to community education efforts.
It is probably because this group’s main 
interest is in therapy and diagnosis (which 
is shared by many other psychologists)
that a majority of them prefer to think of
community mental health as a new attitude 
rather than as a new interest or a new spe- 
cialty. 

Still other psychologists in this area are 
mainly engaged in the administration of 
large-scale mental health programs (e.g., 
with state departments of mental health). 
In these positions, they tend to deal with 
agencies and community groups rather than 
with individual patients. Their goal is to 
maximize and coordinate all of the mental 
health resources and activities within their 
geographical areas. In pursuing this goal, 
they often become vitally concerned with 
legislative matters such as laws and appro- 
priations, and with many facets of commu- 
nity life that may have an influence on the 
mental health of the public such as the 
presence of high delinquency areas or the 
absence of recreational facilities for young- 
sters. In addition to purely administrative 
duties, these psychologists very often also 
engage in consultation and community edu- 
cation activities. The consultation of this 
group generally differs from the consultation
engaged in by those interested in therapy
and diagnosis in that the former is
likely to be with administrative personnel 
concerning agency policies or programs and 
not concerning individual cases (for a more 
complete differentiation of “administrative” 
and “case” consultation, see Bindman, 
1959). The primary goals of the commu- 
nity education efforts of these two groups 
are also likely to differ with the administra- 
tors having the primary goal of creating 
a favorable public opinion for the passage 
of needed laws or the start of needed com- 
munity action rather than to educate the 
public in matters relating to mental health 
or illness per se. Since most of the administrators
deal with mental health matters
on a community-wide basis and through
the media of community agencies and 
groups, they are likely to be cognizant of 
the philosophy and principles of public 
health and to feel some relationship to the 
public health movement.

The above differentiation of various subgroups
of psychologists interested in community
mental health was done for purposes
of exposition. It should be clearly understood
that no such clear differentiation exists
in practice. Any particular psychologist
in community mental health can be involved 
in any or all of the above mentioned activ- 
ities. 


How Many Psychologists Are Involved in 
Community Mental Health Activities? 


It is highly improbable that this question 
will ever be answered with a single figure. 
At the present time, it is not even possible 
to answer with a very gross figure because 
of the ambiguity in the term “community 
mental health” and because of the problems 
inherent in attempting to “count the heads” 
of psychologists in any area of activity. On 
the basis of this survey, however, what can 
be said is that as a very rock-bottom mini- 
mum estimate, at least 1,000 psychologists 
are considered to be involved in various 
aspects of community mental health. Even 
this very conservative figure, then, indi- 
cates that at least 1 out of every 17 psy- 
chologists presently has community mental 
health interests. There is every indication 
that this figure will steadily increase in the 
foreseeable future so that these psycholo- 
gists will form a considerable percentage 
of APA membership. 


Should Psychologists Be Involved in 
These Activities? 


This question may be valid, but it is ir- 
relevant on a pragmatic basis. The fact 
is that many psychologists are involved in 
these activities and the trend forecasts that 
more will be involved in the future. It was 
found in this survey that the chief factors 
influencing psychologists to become involved 
in this area are the demands of job duties. 
Thus, it seems that these psychologists did 
not first decide they should engage in these 
activities, but rather they accepted a posi- 
tion and found themselves becoming more 
and more involved in community mental 
health activities. The fact that these psychologists
have continued on with these
activities indicates that they must experience
a sense of professional adequacy and
fulfillment in their new-found roles. The
authors find it difficult to believe that as
many psychologists as there are would be 
involved in these activities if they felt other- 
wise. With a pragmatic criterion, therefore, 
the answer to the question is "Yes, psychol- 
ogists who are so inclined should be in- 
volved in these activities." With an esoteric 
criterion, the question will never be an- 
swerable. 


What Type of Education and Training 
Would Be Appropriate for These
Activities? 


It should be clear from the results of this 
survey that there is no one way to educate 
and train psychologists for community men- 
tal health activities. The opinions of the 
respondents indicate that a variety of ex- 
perimental training programs at different 
levels of training are needed at the present 
time. However, despite the diversity of
opinions concerning details, there are a few 
broad areas of agreement on this type of 
education and training that are worth not- 
ing. 

It was the general consensus of the re- 
spondents that the prime necessity at pres- 
ent is for students in psychology to be made 
more aware of the existence, challenge, and 
opportunities of the community mental 
health area. The respondents, reflecting 
their own experience, felt that students are 
not even aware that such an area of activ- 
ity exists until they have completed their 
education and are employed in the field. 
As a first step, then, there should be a 
greater exposure of students at all levels 
of training to the philosophy, goals, and 
techniques of the community mental health 
area. A frequently mentioned method of 
accomplishing this would entail the use of 
psychologists in community mental health 
to serve as guest speakers or seminar dis- 
cussants. 

There were also areas of agreement 
among the respondents concerning the edu- 
cation and training in specific content areas 
that might be introduced into the predoc- 
toral curriculum. There was high agree- 
ment that training in case consultation 
could begin at the predoctoral level and
was very much needed. Considering that
8 to 9 out of every 10 respondents had some
consultation duties, it certainly seems that
some training in consultation should be introduced
into predoctoral programs. There
was also high agreement that more education
and training in allied social sciences
could become a part of predoctoral programs.
Perhaps the predoctoral student
who looks forward to entering the area of
community mental health should be encouraged
to minor in one of the allied social
sciences such as sociology or cultural
anthropology. There was agreement that
training in long-term individual therapy
should be de-emphasized (not omitted) and
more emphasis placed on short-term therapeutic
methods. Implicit in the latter statement
is the understanding that short-term
therapy involves assessing the reality factors
of the patient’s social circumstances and
making use of community resources in the
treatment process. Finally, there was also
agreement that training in research methods 
and designs should be broadened from the 
classical laboratory approaches to include 
some of the methods and designs developed 
within other social sciences and within the 
public health field. 

Education and training desired on the 
postdoctoral level largely involved an inten- 
sification of training in the areas mentioned 
above. In addition to the intensification, 
however, it was felt that training in admin- 
istrative consultation and in interdiscipli- 
nary research should be introduced at the 
postdoctoral level. There was agreement 
that training in administration and community
organization would be most valuable
at this level. From the general comments
of the respondents, it is apparent that adequate
training in the latter two areas requires
the opportunity for supervised
experience rather than theoretical courses 
alone. Perhaps it was for this reason that 
many of the respondents indicated that 
postdoctoral training programs should be 
closely aligned with an ongoing community 
mental health program. 

There is much in the education and training
needs for psychologists interested in
community mental health that may appear
novel. However, many of these training
needs are no more than a detailing of the
following summary statement made in 1935 
by a special committee of the APA which 
was given the task of drawing up standards 
for the training of clinical psychologists 
(APA, 1935):


The art of dealing with human adjustment requires
more than a knowledge of human behavior.
Adjustment depends as frequently upon a manipulation
of the factors without the individual
as it does upon the analysis of those within, . . .
[The clinical psychologist] should have an appreciation
of the influence of community and family
life upon the behavior of the individual. It is
therefore necessary that he pursue courses in sociology
and in social pathology (pp. 6-7).

It should be noted that this view was not 
even novel in 1935, for in 1907 Lightner 
Witmer wrote: 

Although clinical psychology is closely related to 
medicine, it is quite as closely related to sociology 
and pedagogy. The school room, the juvenile court, 
and the streets are a larger laboratory of psy- 
chology. An abundance of material for scientific 
study fails to be utilized, because the interest of 
psychologists are elsewhere engaged ... (p. 7).

The general theme of the replies and comments
of the respondents in this survey
could be interpreted as an “amen” to the 
above two quotations.


APPENDIX A


COPY OF QUESTIONNAIRE USED IN SURVEY

INSTRUCTIONS


A sample range of answers has been provided for each question. These have been included 
only to indicate the type of answers we have in mind. They in no way were meant to be inclusive 
of all possible answers. Therefore, please feel free to answer each question in your own way,
always attempting to be as specific as possible under the circumstances.

I. What are your present professional activities (including part time activities)?

1. [] Research. Hours per month spent in this activity ______
   a. Type (e.g., program assessment, operational, action, laboratory, clinical, etc.)
   b. Subjects (e.g., college students, adult hosp. patients, juvenile delinquents, workers in industry, families, etc.)
   c. Role in research (e.g., administrative only, collection of data, consultant in various phases, analysis of data, etc.)

2. [] Teaching Psychology. Hours per month spent in this activity ______
   a. Setting (e.g., Dept. of Psychology, School of Public Health, Dept. of Psychiatry, Institute, etc.)
   b. Course content (e.g., general, clinical, social, consultation, etc.)
   c. Level (e.g., undergraduate, graduate, postgraduate, etc.)

3. [] Training Psychologists. Hours per month spent in this activity ______
   a. Setting (e.g., child guid. clinic, hosp. outpatient dept., personnel dept. in industry, community clinic, etc.)
   b. Subject matter (e.g., therapy, diagnosis, consultation, community organization, administration, etc.)

4. [] In-Service Training (excluding psychologists). Hours per month spent in this activity ______
   a. Setting (e.g., child guid. clinic, hosp. outpatient dept., community clinic, state dept. of mental health, etc.)
   b. Audience (e.g., psychiatrists, social workers, teachers, nurses, ministers, probation officers, etc.)

5. [] Therapy. Hours per month spent in this activity ______
   a. Setting (e.g., guid. dept. in school, VA mental hyg. clinic, state hosp., private practice, etc.)
   b. Clients (e.g., grossly-disturbed children, delinquents, underachieving students, adult offenders, alcoholics, etc.)
   c. Type (e.g., group, individual, long-term individual, short-term individual, etc.)

6. [] Diagnosis. Hours per month spent in this activity ______
   a. Setting (e.g., guid. dept. in school, VA mental hyg. clinic, state hosp., private practice, etc.)
   b. Purpose (e.g., aid treatment, research, selection, discharge planning, vocational guid., etc.)

7. [] Administration. Hours per month spent in this activity ______
   a. Primary type of program (e.g., service, research, educational, etc.)
   b. Institutional setting (e.g., child guid. clinic, hosp. outpatient dept., state dept. of mental health, etc.)

8. [] Consultation. Hours per month spent in this activity ______
   a. To whom (e.g., schools, social agencies, correction agencies, health units, industry, etc.)?
   b. In what areas (e.g., program planning, treatment, research training, personnel practices, diagnosis, etc.)?

9. [] Community Education. Hours per month spent in this activity ______
   a. Method (e.g., discussion groups, lectures, sociodrama, TV, radio, newspaper articles, etc.)
   b. Primary audience (e.g., parents, PTA groups, church groups, civic organizations, community at large, etc.)

10. [] Other activities. (Please specify and indicate number of hours per month.)

A Reminder: The sample answers are meant ONLY AS EXAMPLES; therefore, please feel
free to answer each question in your own way.

IIa. If you could have had (or did have) specific education or training in any of the content
areas listed below, which ones do you think would have (or have) best prepared you for
your present professional activities? (Note: Please rank order your choices.)

   a. [] administration
   b. [] community organization
   c. [] short-term (crisis) therapy
   d. [] consultation theory and techniques
   e. [] public health principles and techniques
   f. [] action research
   g. [] epidemiology, ecology, biostatistics, etc.
   h. [] principles and concepts of allied social sciences
   other content areas (specify):
   i. []
   j. []
   k. []

IIb. If the opportunity for education or training in the specific content areas you have checked
above were to become generally available to psychologists, at what level should this opportunity
be offered? (It is realized that you may feel that some content areas may be
appropriate at one level while other content areas may be appropriate at another level.
However, if you had to choose one level, which level would that be?)

   a. [] predoctoral
   b. [] postdoctoral right after PhD
   c. [] postdoctoral after a few years post-PhD experience
   d. [] other (specify)

A Reminder: The sample answers are meant ONLY AS EXAMPLES; therefore, please feel
free to answer each question in your own way.


IIc. 


TIp. 


ПІ. 


IVa, 


IVs, 
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Predoctoral training of this type for psychologists should be offered primarily 
a. [] within Departments of Psychology. 
b. [] within interdepartmental programs or institutes. 
c. [] other (specify) 


Postdoctoral training of this type for psychologists should be offered primarily 
a. [] within Departments of Psychology. 
b. [7] within interdepartmental programs or institutes. 
c. [] within Schools of Public Health. 
d. [Г] by means of workshops. 
e 0 

o 


within the framework of well-organized on-going community mental health 
programs. 


other (specify) 


On the basis of your present knowledge concerning the demands made upon your time in 
your professional duties and the development of your interests, what activities do you feel 
you will be doing more of in the foreseeable future? (Note: Please rank order your 


choices.) 


a. [] consultation            g. [] diagnosis
b. [] administration          h. [] teaching
c. [] community education     i. []
d. [] in-service education    j. []
e. [] research                k. []
f. [] therapy                 l. []


A Reminder: The sample answers are meant ONLY AS EXAMPLES; therefore, please feel
free to answer each question in your own way.


In what year would you say your interest in community mental health developed to the
point where you would state explicitly "I am interested in the area of community mental
health"?

a. 19__     b. [] not yet     c. [] never

What were the strongest influences in the development of your interest in community 
mental health activities? (Note: Please rank order your choices.) 

a. [] university staff during my education
b. [] supervisors during my internship
c. [] specific colleagues or friends within psychology
d. [] specific colleagues or friends outside psychology
e. [] job duties which involved community mental health activities
f. [] dissatisfaction with traditional activities (e.g., testing, therapy, etc.)
g. [] Zeitgeist


V. Within the foreseeable future, "community mental health" for psychologists should 
primarily be: (Note: Make only one choice.) 


a. [] a specialty, comparable to clinical psychology, industrial psychology, etc.
b. [] an attitude, comparable to the clinical attitude or public health attitude, etc.
c. [] an area of interest, comparable to juvenile delinquency, geriatrics, etc.
d. [] other (specify)


A Reminder: The sample answers are meant ONLY AS EXAMPLES; therefore, please feel 
free to answer each question in your own way. 


VI. We would like to include as broad a sample as possible in this questionnaire survey. However,
since there is no available list of psychologists interested in community mental
health, we must rely on “word-of-mouth” procedures to learn who these psychologists are. 
You can be of great aid in this task by listing below the names of those psychologists who 
you know to be interested in, or engaged in, community mental health activities, and 
whom you feel that we might not have included in this survey. 


1.                    6.
2.                    7.
3.                    8.
4.                    9.
5.                    10.


VII. General comments concerning any of the areas covered by this questionnaire or concern- 
ing the education and training of psychologists for community mental health activities. 
Unless indicated otherwise, all tests of significance were made with chi square analyses. The degrees of
freedom given in the column heads are for each figure in a given column except those figures that are followed 
by parentheses enclosing the number of degrees of freedom for those particular figures. 


                              Degree          Place of          Main              Opinion
Tables                        (df = 1)        Employment        Activities        on CMH
                                              (df = 5)          (df = 3)          (df = 2)

Table 1:
  Age^a                       0.30            12.60**           2.08              5.56*
  Years Degree Held^a         27.92****       14.98**           3.00              1.70
  Years of CMH Interest^a     1.70            24.12****         16.30****         3.88
  Percent Doctorates          no test         46.42****         13.24***          0.20
Table 2:
  Geographical Distribution^b 11.94 (8)       no test           34.67* (24)       16.45 (16)
Table 3:
  Place of Employment         46.42**** (5)   no test           273.73**** (15)   20.19** (10)
Table 4:
  Main Activities             13.24*** (3)    273.73**** (15)   no test           10.70* (6)
Table 5:
  Opinion on CMH              0.20 (3)        23.30* (15)       19.05** (9)       no test
Table 24:
  Consultation                0.20            36.60****         21.16****         4.89*
  Research                    1.68            17.54***          32.76****         7.25**
  Administration              0.10            13.08**           24.39****         0.62
  In-Service Education        5.59**          18.33***          12.63***          3.78
  Community Education         3.50*           8.61              8.71**            3.59
  Teaching                    2.89*           17.97***          16.14***          2.86
  Therapy                     0.85            17.97***          34.85****         2.12
  Diagnosis                   3.56*           9.12 [illegible]  13.32***          0.21
Table 25:
  Consultation                2.23            28.47****         26.59****         21.20****
  Allied Social Sciences      0.00            17.26***          16.21***          12.28
  Administration              0.00            21.55****         33.25****         3.48
  Short-Term Therapy          5.79**          31.44****         31.22****         7.92
  Community Organization      0.26            18.78***          7.05*             3.76
  Action Research             0.15            13.30**           30.00****         [illegible]
  Public Health Principles    0.64            20.38****         [illegible]       [illegible]
  Epidemiology and Ecology    4.05**          18.48***          22.98             [illegible]
  Clinical Courses            0.01            4.83              6.04              3.48
  All Others                  0.37            14.94**           1.11              [illegible]
Table 26:
  Level for CMH Training      11.35*** (3)    24.62 [illegible] (15)  8.09 (9)    [illegible]
Table 27:
  Locus [illegible]           3.04 (3)        24.10* (15)       17.73** (9)       11.22* (6)
Table 28:
  Locus [illegible]           19.34*** (6)    no test           27.24* (18)       19.27* (12)
APPENDIX B—Continued


                              Degree          Place of          Main              Opinion
Tables                        (df = 1)        Employment        Activities        on CMH
                                              (df = 5)          (df = 3)          (df = 2)

Table 29:
  Job Duties                  0.73            11.56**           8.63**            0.50
  Dissatisfied with
    Traditional Duties        0.61            1.89              5.76              7.57**
  Nonpsychology Colleagues    0.28            10.38*            4.43              5.83
  Psychology Colleagues       1.00            5.32              6.80*             4.25
  Zeitgeist                   1.67            5.72              10.73**           2.63
  University Staff            1.81            no test           7.23*             2.12
  Intern Supervisors          no test         no test           8.26**            no test
  Other                       0.93            no test           11.75***          5.95 [illegible]

a Median test of significance.
b Chi square for distribution of APA members and survey subjects = 44.89**** (8).
* p < .10.
** p < .05.
*** p < .01.
**** p < .001.
HELEN K. MARSHALL
University of Kentucky 


This report describes a combination of
two types of investigation: (a) a
study of the relations between home experiences
and children’s social behavior in preschool
groups, and (b) an exploration of
preschool children's use of language and
hostility to influence and adjust to age peers
and teachers.
The ideas for both studies were derived
from an investigation of preschool children's
social behavior by Boyd R. McCandless and
the present author (1957a, 1957b). Marshall
and McCandless developed a method
for recording social interactions among a
group of preschool children that yielded
quantitative measures of classifications of
social behavior of individual children. These
observation measures had correlations of
greater magnitude with tests and adult
judgments of child behavior than were obtained
between the two latter estimates of
behavior. These measures appeared to be
suited for use as measures of children’s
social behavior in an investigation of the
premise that home experiences are major
determinants of the behavior of children
away from home. The problem in planning
this study was to find a basis for selection
of home experience variables.
This study was planned during the period
in the mid-1950s when failure to find relations
was described in most reports of research
about parent-child relations. Although
Baldwin and coworkers (Baldwin,
1948, 1949; Baldwin, Kalhorn, & Breese,
1945) had reported evidence of relations in
the late 1940s, subsequent investigators, such
as Sears, Whiting, Nowlis, and Sears
(1953), Highberger (1955), and Burchinal,
Hawkes, and Gardner (1957), were not so
successful. Additionally, some of the parent
measures that had related to child measures
in the studies by Baldwin et al. did not
relate to measures of children’s behavior in
Highberger's study. Such research literature,
then, did not furnish an adequate basis
for the selection of home experience variables
for the proposed study. Other investigators
have used psychological theories as a
solution to this dilemma, but the two major
theories were not relevant sources of clues
for this investigation. Psychoanalytic theory
offers many hypotheses about home determinants
of "undesirable" behavior in children,
but has little to say about “socially
desirable” behavior, a characteristic of most
of the child observation measures developed
by Marshall and McCandless. Derivation
of hypotheses from learning theory depends
on specific situations, which in this instance
were unknown. It appeared, then, that selection
of parent variables for this
study could not be based on either evidence
or theory.

1 The investigation reported in this paper was
conducted in connection with a project of the Home
Economics Department of the Kentucky Agricultural
Experiment Station and is published by permission
of the station Director.

The idea of exploratory study of social 
use of language was initiated by Marshall 
and McCandless’ findings. Two measures 
of social interaction developed in that in- 
vestigation were based on the frequency of 
the child’s talk with peers. These two meas- 
ures entered into larger correlations with 
estimates of social acceptance in the group 
than measures of social interaction that did 
not require the child to use language. This 
suggested that an exploration of children’s
use of language with peers and teachers
might indicate social behavior important for
children's participation and acceptance in a
preschool group. Perusal of research litera- 
ture indicated that few aspects of child 
development have been studied as inten- 
sively and by so many investigators as 
language development, but that most studies 
have been concerned with the development 
of vocabulary, sentence structure, gram- 
matical usage, and articulation. Investiga- 
tions of children's use of language to adapt 
to other environmental demands have been 
limited to (a) tests of Piaget's hypotheses 
that the child's response to the environment 
is egocentric (e.g., Fisher, 1934; McCarthy,
1930); and (b) studies of when and why 
children ask questions (e.g., Smith, 1933). 

In this investigation it was necessary to 
develop classifications and measures to ex- 
plore children's use of language. The as- 
sumption could not be made a priori that 
these measures and classifications were in- 
volved in influencing and adjusting to age 
peers and teachers. This claim had to be 
demonstrated through analyses of relations 
with measures of social interaction and so- 
cial acceptance obtained concurrently in the 
preschool group. Measures of social inter- 
action and social acceptance were proposed 
as child variables for the study of parent- 
child relations. Hence, the exploration of 
children's use of language and hostility to 
influence and adjust to peers required the 
child measures proposed for the study of 
parent-child relations. 

The inclusion of use of language among
child social behavior variables was desirable
for the study of parent-child relations for
an equally cogent reason. When language
use became part of the social behavior to
be studied, ample evidence became available
on which to base the selection of parent
and home experience variables. Relations
between home experiences and the well-studied
aspects of language development
have been demonstrated in many investigations
conducted during almost all the years
that child psychology has been a field of
knowledge. Knowledge of these relations
was extensive enough by 1940 to permit
Dawe (1942) to conduct an experimental
test of relations found in real life by other
investigators. McCarthy (1954), in summarizing
these investigations, includes description
of relations for the home experiences
of: socioeconomic status of the
family; education of parents; living in a
family vs. institutionalization; travel; association
with adults; and experience with
words, books, pictures, and their real life
counterparts.

The two possible studies were combined 
to produce this investigation of the relation 
of home experiences to children's use of 
language and hostility in play interactions 
of preschool groups. This study had three 
initial purposes:

1. To investigate preschool children's use 
of language and hostility to influence and 
to adjust to age peers 

2. To study the relation of home experi- 
ences to children's social behavior with age 
peers in preschool groups 

3. To investigate the relations between 
home experiences and children's dependence 
on teachers during child directed play in 
preschool groups

The first analyses to be performed were
concerned with reliability and age differ- 
ences for measures of use of language and 
hostility. The findings indicated that a 
dichotomy of classification for these new 
measures might consist of two opposite 
variables. Hence, a fourth purpose of this 
study became the delineation of these two 
variables: the use of dramatic play language 
and dramatic play hostility with peers, and 
the use of reality language and reality 
hostility in play with peers. 
METHOD
Subjects 


The subjects of this investigation were 108 children
attending Kentucky preschools and both the
mothers and fathers of 101 of these children. The
age and sex division of the children is shown in
Table 1. Data were not collected from parents of
three girls and four boys in the 2½-3½ year group,
because these children were added to the sample
to enlarge that age group for analyses of age
differences.

All children attended a morning preschool group
of one of three schools: the Berea College Nursery
School (one group), the Douglass Boulevard
Christian Church Preschool in Louisville (three
TABLE 1

NUMBER OF CHILDREN IN EACH AGE AND SEX GROUP

Age groups        Girls    Boys    Both sexes
2½-3½ years         7        9        16
3½-4½ years        17       18        35
4½-5½ years        18       17        35
5½-6½ years         7       15        22
All ages           49       59       108


groups), and the University of Kentucky Home 
Economics Nursery School (four groups). The 
three schools were selected for their similarity in 
guidance policies, in buildings, equipment, and 
staff, and in the socioeconomic status of parents 
enrolling children. Guidance policies at all schools 
emphasized the importance of children directing 
their own play during most of the school day. All 
children had attended preschool for at least 6 weeks 
prior to observation, and most had attended pre- 
school for more than a half year. 

None of the families was below the professional-business
managerial level in socioeconomic status,
except the presently unclassifiable families of four
college students. In other instances when the
father’s occupation was below this level, by reasons
of income and inherited social position, the families
could be classified at the socioeconomic level above
the professional level. Most fathers were engaged
in the professions. There were more medical doctors,
dentists, lawyers, artists, and musicians in
this majority than there were college professors.
The mean education of fathers was a semester
more than a bachelor's degree, and the mean education
of mothers was a semester less than a bachelor's
degree. This was true at all three preschools.
Almost all families lived in houses they had built
in preferred residential areas of Berea, Lexington,
and Louisville. The child in the sample was the
only child for 7 of the 101 families. Experience
2 The assistance and cooperation of these schools
in permitting and facilitating data collection is
appreciated. The Directors of the three preschools
studied in the spring of 1957, who contributed
much time to this investigation, were Virginia S.
Chance of Louisville, Opal Wolford of Berea, and
Billie K. Cope of Lexington. The following
teachers of the preschool groups also gave assistance:
at Berea College, Angli Wai; at the Louisville
school, Lucille Filson, Chris Inman, Mary Lois
Koenig, Rubye McDowell, and Donna Vick; at
the University of Kentucky, Joann Atcher, Hazedeen
Brewster, Rohini Doshi, Rachel C. Graves,
Jean G. Hobart, and Hazel McCrary.


with siblings of the subjects undoubtedly affected 
the experiences as parents and the attitudes of all 
other mothers and fathers. 

Data were collected in the spring of 1957 for all
children 3½ years of age or older, and their parents.
Data were collected for only three children in the
youngest age group at this time. In the 1957-58
academic year, data were collected for six children
aged 2½ to 3½ years who were enrolled in two
groups of the University of Kentucky nursery
school, and for their parents. Data were collected
at this school in the fall of 1958 for the seven
children of this age whose parents were not interviewed.
All data were collected by the author.


Measures 


The measures of children's social behavior with 
peers in this investigation were of four kinds: 

1. Observation measures of the number of social 
interactions, developed by Marshall and McCand- 
less (1957b) 

2. Observation measures of the frequency of use 
of language and hostility, developed in this in- 
vestigation 

3. Measures of aggression, submission, and 
dominance obtained by summations of the use of 
language and hostility scores 

4. Scores on the picture sociometric test (McCandless
& Marshall, 1957a), an estimate of social
acceptance in the preschool group

Two tests administered to children as controls 
for the new measures of use of language and 
hostility were the Vocabulary test of the Stan- 
ford-Binet Test of Intelligence (Terman & Merrill, 
1937), and the Lerner Blocking Technique No. 2 
(Lerner, 1956), as scored for aggression by Otis 
and McCandless (1955). 

There were two types of measures of children's 
dependence on teachers during preschool play:

1. Observation measures of the number of social 
interactions with teachers (Marshall & McCandless,
1957a)

2. Observation measures of the frequency of use 
of language and hostility between child and teach- 
ers, developed in this investigation 

Measures of home experiences were obtained 
from interview information and tests, and were 
the following:

1. Measures of home experiences with the dra- 
matic play topics of the preschool group, developed 
in this investigation 

2. Time the child spent listening to stories and 
records and watching television 

3. Time spent in talk with the child by family 
members and maids 

4. Education of both parents 

5. Scores of both parents on 24 scales and 5 composite
scales of the Parental Attitude Research
Instrument (Schaefer & Bell, 1955, 1958) 
A detailed description of these measures, in the
order listed above, is presented after the general 
description of child observation procedures that 
follows. 

Each child was observed in child directed play
in the preschool group for a minimum of 100 minutes,
or for 50 2-minute time sample records. Child
directed play was defined as follows: the children
initiate and carry out all play ideas, activities, and
solutions, and there is the barest minimum of social
interaction with teachers. Observations were spaced
over 3-5 weeks so that no more than 10 minutes
of play per day was recorded for any child.

The time sample record of behavior was an 
elaboration of the method for group observation 
of children's play developed by Marshall and Mc- 
Candless (1957b). In this method, the observer 
records the number and type of social interactions 
that occur in 2 minutes among the several children 
playing together or near each other at the beginning
of the record and any other children or adults
who may approach these children. Children's names 
are written in diamonds printed on the record form. 
Each side of the diamond represents a behavior 
classification. Social interactions are recorded by 
lines drawn between the appropriate sides of the 
diamonds for two or more persons. 

Interactions between Children. In the Marshall
and McCandless method, interactions between children
are recorded in the following four classifications:

1. Association—when children seem aware of a
common interest, activity, or goal (most dramatic 
play can be classified as association) 

2. Friendly approach or response—the use of 
neutral, pleasant, friendly, or helpful words to 
approach or respond to another person 

3. Conversation—the continuation of a friendly 
approach for at least a half minute of the 2-minute 
record 

4. Hostile—any approach or response that inter- 
feres with the activity of another, attacks another, 
or is a judged withdrawal from another person

The measures obtained from the records are the
mean number of children per 2-minute record with 
whom the child had that category of interaction. 
An overall measure of friendly behavior is the 
sum of the measures in the three friendly classi- 
fications, all friendly interactions. 
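
The interaction measures just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (one dictionary per 2-minute record, mapping a category name to the set of peers involved) and the names `interaction_measures`, `FRIENDLY`, and `CATEGORIES` are hypothetical and do not come from the Marshall and McCandless record form.

```python
# Hypothetical record layout: one dict per 2-minute record for a single child,
# mapping an interaction category to the set of peers involved in that record.
FRIENDLY = ("association", "friendly_approach", "conversation")
CATEGORIES = FRIENDLY + ("hostile",)


def interaction_measures(records):
    """Return the mean number of peers per record for each interaction category.

    The "all_friendly" entry is the sum of the three friendly measures,
    following the description in the text.
    """
    if not records:
        raise ValueError("at least one observation record is required")
    means = {}
    for category in CATEGORIES:
        total = sum(len(record.get(category, ())) for record in records)
        means[category] = total / len(records)
    means["all_friendly"] = sum(means[c] for c in FRIENDLY)
    return means


if __name__ == "__main__":
    # Four invented records for one child (illustrative data only).
    example = [
        {"association": {"Bob", "Cy"}, "hostile": {"Bob"}},
        {"friendly_approach": {"Bob"}},
        {},
        {"association": {"Cy"}, "conversation": {"Cy"}},
    ]
    print(interaction_measures(example))
```

A record with no interactions still counts in the denominator, since the measure is a mean per record rather than per interaction.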

Use of Language and Hostility. Children’s use 
of language and hostility with peers and teachers 
was recorded at the same time as interactions with 
peers and teachers. The record consisted of sym- 
bols written in the child's diamond for any classi- 
fications of language and hostility that were used 
during the 2 minutes of observation. 

The classifications developed for this investigation
were based on two sources. The investigator's
experience in collecting observation records of
children's play and her experience as a nursery
school teacher suggested that preschool children's
use of language and hostility during dramatic play
differed from their use of language and hostility
when they talked as themselves and were concerned
with reality. This distinction was developed into
the dramatic play and reality dichotomy of this
investigation. The other categories were developed
through trial-and-error observation of children's
use of language during preschool play. The selected
categories of suggestion, imitation, agreement,
greeting, and question could be discriminated
in observing children's play, and appeared to include
all language use between children.

The dramatic play and reality distinction for use 
of language and hostility was made at the time of 
recording of any and all other categories of use 
of language and hostility. When the child was 
engaged in dramatic or imaginative play and spoke 
in character, as if he were the person, animal, or 
thing he was attempting to represent, this behavior 
was classified as dramatic play or "role" use of 
language and hostility. Role use of language was 
recorded for the child by placing an “R” with the 
symbol for the other classification of that language 
use on the friendly side of the diamond represent- 
ing that child on the record form. When the child 
talked as himself, this behavior was classified as 
reality or "self" use of language. Self use of language
was recorded for the child by placing an
“S” with the symbol for the other classification of
that language use on the friendly side of the 
diamond for that child. 

Use of any of the other five categories at least
once during the 2 minutes sampled was recorded
in the child's diamond by the capital of the initial
letter of the category name with the appropriate
symbol (R or S) for the dramatic play-reality
distinction. Definitions of these categories are as
follows:

Suggestion (S) —the child uses language to sug- 
gest an idea or activity for self or others 

Imitation (I)—the child imitates the words or 
sounds originated by another child 

Agreement (A)—the child uses language to 
agree with or to follow the suggestion of another 
person 

Greeting (G)—the child greets, welcomes, or
pays attention to in a "just noticing" way (or 
returns such a greeting to) someone not present, 
not noticed, or not sharing play in the immediately 
preceding time 

Question (Q)—the child asks a question of 
another 

The use of language records could be and was 
tabulated separately for each of the three friendly 
interaction categories. Measures used in reporting 
results, that are not identified by the name of the
interaction category, are those tabulated for the 
friendly approach interactions. Measures of the 
six classifications of dramatic play and reality sug- 
gestion, imitation, and agreement are the percent- 
age of 50 or more observation records in which 
the behavior was recorded and a friendly approach 
interaction between children was recorded also. 
Measures of dramatic play and reality greeting and 
question include behavior toward both peers and
teachers. They are the percentage of 50 or more 
observation records in which the behavior was 
recorded between children or between teachers. 
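
The percentage-of-records measures described above can be illustrated with a short Python sketch. The record layout, the two-letter codes (for example "RS" for dramatic play, or role, suggestion and "SA" for reality, or self, agreement), and the function name `percent_of_records` are assumptions made for this illustration and are not the notation of the original record form.

```python
def percent_of_records(records, language_code, interaction="friendly_approach"):
    """Percentage of records in which a language code AND an interaction co-occur.

    Each record is assumed to be a dict with two sets:
      "language"     - codes entered in the child's diamond for that record
      "interactions" - interaction categories recorded for the child
    """
    if not records:
        raise ValueError("at least one observation record is required")
    hits = sum(
        1
        for record in records
        if language_code in record["language"]
        and interaction in record["interactions"]
    )
    return 100.0 * hits / len(records)


if __name__ == "__main__":
    # Four invented records for one child (illustrative data only).
    example = [
        {"language": {"RS"}, "interactions": {"friendly_approach"}},
        {"language": {"RS", "SA"}, "interactions": {"association"}},
        {"language": set(), "interactions": {"friendly_approach"}},
        {"language": {"RS"}, "interactions": {"friendly_approach", "association"}},
    ]
    print(percent_of_records(example, "RS"))                 # friendly approach
    print(percent_of_records(example, "RS", "association"))  # association
```

Passing a different `interaction` value gives the association or conversation versions of the same measure.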

Use of language during association and conversation
interactions is identified by the name of the
interaction category. Association dramatic play and
reality suggestion, imitation, and agreement measures
are the percentage of observation records in
which both an association interaction and the use
of language of the particular category were recorded.
Conversation dramatic play and reality
suggestion, imitation, and agreement are the percentage
of observation records in which both a
conversation interaction and the use of language
of the particular category were recorded.

The only elaboration of the hostile interaction 
category of the Marshall-McCandless method was 
a distinction between hostility in dramatic play and 
in reality play. Hostile language was not recorded 
in any other way. Children were judged to show 
dramatic play hostility when they shot others dead, 
when they said, "You Indians, get out of our fort!” 
or “Daddy, that’s not the way to feed the baby,” 
and whenever the expression of hostility carried 
out their role in dramatic play. The expression 
was classified as reality hostility when the child’s 
behavior was in his own behalf. For example, 
Mark showed reality hostility when he grabbed 
and shook the handlebars of the tricycle Judy rode. 
Judy's response, “This is my tricycle!” also showed 
reality hostility. Dramatic play and reality hostility 
were recorded by placing an R or an S, respec- 
tively, on the hostile side of the child's diamond. 
The measures of dramatic play and reality hostility
are the percentage of observation records in
which the behavior was recorded and a hostile
interaction between children was recorded also.


Aggression, Submission, and Dominance. The 
measures of children's aggression, submission, and 
dominance were obtained by adding the appropriate 
use of language and hostility measures. The defini- 
tions of these measures by words and by the parts 
summed are as follows:

Positive Aggression—the frequency of friendly 
suggestion, or the sum of the percentages for 
dramatic play suggestion and reality suggestion 

Submission—the frequency of imitation and 
agreement, or the sum of the percentages for 
dramatic play imitation and agreement and reality 
imitation and agreement

Aggression—the frequency of hostility and sug- 
gestion, or the sum of the percentages for dramatic 
play hostility and suggestion and reality hostility
and suggestion 

Dominance—the difference in frequency of ag- 
gression and submission, or the sum of the per- 
centages obtained for submission subtracted from 
the sum of the percentages obtained for aggression 
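
The four summed measures defined above reduce to simple arithmetic on the eight percentage scores. The sketch below follows those definitions term by term; the dictionary keys and the function name `composite_measures` are hypothetical labels chosen for the illustration, with "role" standing for dramatic play and "self" for reality.

```python
def composite_measures(p):
    """Combine percentage scores into the four summed measures.

    `p` maps hypothetical keys such as "role_suggestion" (dramatic play)
    and "self_suggestion" (reality) to percentage-of-records scores.
    """
    positive_aggression = p["role_suggestion"] + p["self_suggestion"]
    submission = (
        p["role_imitation"] + p["role_agreement"]
        + p["self_imitation"] + p["self_agreement"]
    )
    # Aggression adds the two hostility scores to the two suggestion scores.
    aggression = p["role_hostility"] + p["self_hostility"] + positive_aggression
    # Dominance is submission subtracted from aggression.
    dominance = aggression - submission
    return {
        "positive_aggression": positive_aggression,
        "submission": submission,
        "aggression": aggression,
        "dominance": dominance,
    }


if __name__ == "__main__":
    # Invented percentages for one child (illustrative data only).
    scores = {
        "role_suggestion": 20, "self_suggestion": 10,
        "role_imitation": 4, "role_agreement": 6,
        "self_imitation": 2, "self_agreement": 8,
        "role_hostility": 5, "self_hostility": 3,
    }
    print(composite_measures(scores))
```

Because dominance is a difference of sums, it can be negative for a child whose imitation and agreement outweigh suggestion and hostility.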

Reliability of Child Observation Measures. Reliability
between observers was established between
the investigator and a Director of the University
of Kentucky nursery school, Billie K. Cope, for
the elaborated observation records. Agreement of 
92% of the entries was obtained for 10 consecutive 
records, after a practice period of about 50 records. 
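
The report gives the agreement figure but not the formula used to obtain it. One common way to compute percentage agreement on record entries is sketched below as an assumption, not as the author's procedure: each observer's entries are treated as a set of (record, child, symbol) items, and agreement is the share of all entries made by either observer that both observers made. The name `percent_agreement` and the entry layout are hypothetical.

```python
def percent_agreement(observer_a, observer_b):
    """Percentage of entries on which two observers agree.

    Assumed definition: entries made by both observers divided by
    entries made by at least one observer, times 100.
    """
    a, b = set(observer_a), set(observer_b)
    either = a | b
    if not either:
        raise ValueError("neither observer recorded any entries")
    return 100.0 * len(a & b) / len(either)


if __name__ == "__main__":
    # Invented entries: (record number, child, symbol) - illustrative only.
    first = {(1, "Ann", "RS"), (1, "Bob", "SA"), (2, "Ann", "SS"), (2, "Cy", "RI")}
    second = {(1, "Ann", "RS"), (1, "Bob", "SA"), (2, "Ann", "SS"), (2, "Cy", "RA")}
    print(percent_agreement(first, second))
```

Other definitions, such as dividing by one observer's total entries, would give somewhat different figures for the same records.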

Reliability over time for play observation meas- 
ures was determined twice for two consecutive 
2-week periods. Fifty or more 2-minute records 
were collected per child for the nursery school 
group of 11 children, aged 35 to 58 months, 
observed in 1957. Only 35 or more records were 
collected in each observation period for the nursery 
school group of 12 children of the same age range 
observed in 1958. Product-moment correlations 
between observation periods in both years are pre- 
sented in Table 2 for scores of use of lan- 
guage and hostility and of play interaction among 
children. 
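
For readers who want to reproduce this kind of reliability check, the product-moment correlation between scores from two observation periods can be computed as in the following sketch. The function name `product_moment_r` and the example scores are illustrative assumptions; none of the numbers are taken from the study.

```python
import math


def product_moment_r(first_period, second_period):
    """Pearson product-moment correlation between two paired score lists."""
    if len(first_period) != len(second_period):
        raise ValueError("both periods need one score per child")
    n = len(first_period)
    if n < 2:
        raise ValueError("at least two children are required")
    mean_x = sum(first_period) / n
    mean_y = sum(second_period) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(first_period, second_period))
    sxx = sum((x - mean_x) ** 2 for x in first_period)
    syy = sum((y - mean_y) ** 2 for y in second_period)
    if sxx == 0 or syy == 0:
        raise ValueError("scores in a period must not all be identical")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Invented percentage scores for five children in two 2-week periods.
    period_1 = [12.0, 30.0, 8.0, 22.0, 16.0]
    period_2 = [10.0, 34.0, 6.0, 20.0, 18.0]
    print(round(product_moment_r(period_1, period_2), 2))
```

With groups as small as 11 or 12 children, a single child's scores can shift r noticeably, which is worth keeping in mind when comparing the two groups.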

Reliability over time characterized dramatic play 
use of language and hostility, but was not found 
for reality use of language and hostility, as is 
shown in Table 2. The r's listed for dramatic play 
use are large enough to indicate fairly high reli- 
ability between observation periods for both groups. 
The small r's listed for reality use indicate low 
reliability over time. The 1958 reliability observa- 
tions were conducted to check this finding of 
change with time in reality use of language and 
hostility for the 1957 group. 

Correlations listed for the play interaction scores 
are as high as those reported by Marshall and 
McCandless (1957b). 

Correlations between observation periods for the 
association and conversation use of language and 
hostility scores were computed only for the 1957 
group and are not listed in Table 2. Sizes of r's 


TABLE 2

PRODUCT-MOMENT CORRELATIONS OBTAINED FOR
TWO DIFFERENT PRESCHOOL GROUPS TO INDICATE
THE RELIABILITY OF CHILD OBSERVATION MEASURES
OVER THE TIME SPAN OF CONSECUTIVE TWO-WEEK
OBSERVATION PERIODS

Child observation           1957 group     1958 group
measures                    (N = 11)       (N = 12)

Dramatic play language:
  Suggestion                .84            .96
  Imitation                 .49            .90
  Agreement                 .81            .85
  Hostility                 [illegible]    [illegible]
Reality language:
  Suggestion                .55            [illegible]
  Agreement                 [illegible]    [illegible]
  Hostility                 [illegible]    [illegible]
Social interaction:
  Association               .89            .85
  Friendly approach         [illegible]    .91
  Conversation              .61            .85
  Hostile                   .86            .64

resembled those listed in Table 2 for the friendly
approach measure in all instances. 

Marked change with time was indicated for 
between period correlations for reality greeting 
(.27) and reality question (.05) in the 1957 group. 
There were too few observed instances to deter- 
mine comparable reliability for dramatic play 
greeting and question. 

Social Acceptance. Social acceptance measures 
were scores on the picture sociometric test, de- 
scribed by McCandless and Marshall (1957a). This 
was the first of three tests administered individually 
to each child during the final week of play observa- 
tion in the preschool group. In this test, each child 
chooses at least three preferred playmates for each 
of three situations from individual photographs of 
all children in the group. The sociometric score 
is based on points assigned choices of all children 
in the group and is an estimate of the child's 
popularity within the group. 

Tests of Vocabulary and Aggression. The child's 
score on the Vocabulary test of Form L of the 
1937 Revision of the Stanford-Binet Test of Intel- 
ligence was expressed as vocabulary age in months. 
This test followed the sociometric test for all 
children. 

The test for aggression consisted of three units 
from the Lerner Blocking Technique No. 2 (Ler- 
ner, 1956), as adapted and scored for frequency 
of aggression and submission by Otis and Mc- 
Candless (1955). The child was given three trials 
of each of the structured doll-play situations of, 
in order of administration, "How can I pass?" 
"My doll stops your car,” and “Who gets there 
first?" The aggression score is the sum of degree 
of aggression points awarded to each response; 
high scores indicate more frequent and intense 
aggression responses. 

Submission scores were not used in analyses 
because only 62 of the child subjects of this inves- 
tigation made any responses that could be classified 
as submissive rather than as aggressive. The infrequency
of submission can be interpreted as
indicating freedom from anxiety about aggression
in this test situation. For several weeks the children
had seen the investigator observe and do
nothing about the aggression occurring in their
play.

The test of aggression was the last of the three 
tests given to each child. Frustrations of this test 
were forgotten as the child selected a toy from
an assortment of dime toys as a reward for 
"playing the games." 

Dependence on Teachers. The frequency of 
child-adult interactions in preschool play can be 
used as a measure of the child's dependence on 
adults, as has been described by Marshall and 
McCandless (1957a). Interactions between children
and teachers were recorded in the same way
as child-child interactions during the observations 
for this study. Friendly dependence measures are 
the mean number of all friendly interactions be- 
tween the child and teachers per observation record. 


Hostile dependence measures are the mean number 
of hostile interactions between the child and teach- 
ers per observation record. 

Use of language and hostility between children 
and teachers was recorded in the same way and 
for the same categories as use of language and 
hostility among children. When a friendly approach 
interaction line was drawn between the diamonds 
for the child and teacher, the use of language and 
hostility by the teacher during that record was 
tabulated as “from teacher to child,” and the use 
of language by the child during that record was 
tabulated as “from child to teacher.” Hence a dis- 
tinction of direction was added for analyses of use 
of language and hostility between children and 
teachers. Measures for dependence use of language 
and hostility are the percentage of observation 
records in which the behavior was recorded. A 
friendly approach or a hostile interaction line be- 
tween the child and teacher was recorded also. 
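
The two kinds of dependence measure described above (a mean number of interactions per observation record, and a percentage of records in which a behavior was tabulated) can be sketched as follows. The record layout and field names are assumptions made for illustration; the study's record form is not reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ObservationRecord:
    """One 2-minute record for one child (hypothetical layout)."""
    friendly_with_teacher: int           # friendly child-teacher interactions
    hostile_with_teacher: int            # hostile child-teacher interactions
    reality_suggestion_to_teacher: bool  # behavior tabulated in this record?


def mean_per_record(counts: List[int]) -> float:
    """Mean number of interactions per observation record."""
    return sum(counts) / len(counts)


def percent_of_records(flags: List[bool]) -> float:
    """Percentage of observation records in which the behavior was recorded."""
    return 100.0 * sum(1 for flag in flags if flag) / len(flags)


# Four hypothetical records for one child (not data from the study).
records = [
    ObservationRecord(2, 0, True),
    ObservationRecord(0, 0, False),
    ObservationRecord(1, 1, False),
    ObservationRecord(1, 0, True),
]
friendly_dependence = mean_per_record([r.friendly_with_teacher for r in records])
hostile_dependence = mean_per_record([r.hostile_with_teacher for r in records])
suggestion_percent = percent_of_records(
    [r.reality_suggestion_to_teacher for r in records]
)
print(friendly_dependence, hostile_dependence, suggestion_percent)
```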

Reliability over time was determined for two 
measures of dependence use of language. In a 
group of 11 children, scores for reality suggestion 
from child to teacher obtained during 2 weeks of 
observation had an r of .69 with the scores for
this kind of dependence obtained from the next 
2 weeks of observation. Scores for reality agree- 
ment by child with teacher from each time period 
had an r of .60. Both r's indicate fairly high
reliability over time for these measures, a char- 
acteristic not found for use of reality language 
with peers. 
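
Reliability over time of this kind is a correlation between each child's score in one 2-week period and the same child's score in the next. A minimal sketch follows, assuming the r reported is the ordinary product-moment correlation (the text does not name the coefficient); the function name is illustrative.

```python
import math
from typing import Sequence


def pearson_r(first: Sequence[float], second: Sequence[float]) -> float:
    """Product-moment correlation between paired scores from two periods."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists of at least two scores")
    n = len(first)
    mean_x = sum(first) / n
    mean_y = sum(second) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(first, second))
    sxx = sum((x - mean_x) ** 2 for x in first)
    syy = sum((y - mean_y) ** 2 for y in second)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical percentage scores for the same children in two periods.
weeks_1_2 = [10.0, 24.0, 30.0, 42.0, 18.0]
weeks_3_4 = [14.0, 20.0, 35.0, 38.0, 25.0]
print(round(pearson_r(weeks_1_2, weeks_3_4), 2))
```

Values near 1 indicate that children kept about the same rank order from one period to the next.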

Parent Interviews. Both parents of 101 child 
subjects were interviewed individually or together 
at the preschool, at the home of the parents, or, 
in a few instances, by telephone. Face-to-face 
interviews could not be arranged with 11 fathers, 
so their interviews were conducted by telephone. 
Parents were interviewed in the time period after 
two-thirds of the play observations in the preschool 
group had been completed, and within 2 weeks after 
the completion of the play observations. Inter- 
views required half an hour to 2 hours of time, 
depending on the parents’ desire to talk. Answers 
to interview questions usually required only 15 to 
20 minutes. The interview blank of one mother 
was lost in the travel transfers of data. 

Parents’ answers to interview questions were 
recorded as completely as possible. Answers known 
to be useful, such as time spent by the child at 
various activities, were recorded verbatim. Inter- 
view questions were mimeographed in the form 
presented in Appendix A. 

Home Experiences with Dramatic Play Topics.
The content of the children's dramatic play during
each observed 2 minutes was described briefly on 
the record form. After at least two-thirds of the 
50 observation records for all children in the group 
had been completed, the dramatic play topics of all 
children were tabulated. The resulting list of the 
preschool group's dramatic play topics was the 
basis of a check list in the interviews for the 
parents of that group. For each dramatic play 
topic, the parent was asked to check home experiences
that could have served as a source of information
about the play topic for his child. The list
of topics for one preschool group is presented on
the check sheet in Appendix A. The list of dramatic
play topics in all preschool groups observed
in the spring of 1957 is presented in the first section
describing results.

Home experience scores derived from these check
lists were the percentage of the number of dramatic 
play topics of the particular preschool group that 
were checked by either or both the father and 
mother. Replies were adequate for percentage 
scores of the eight home experiences that follow. 

1. Talk with father. The father had talked with 
the child about the topic (checked only by the 
father). 

2. Talk with mother. The mother had talked
with the child about the topic (checked only by the
mother).

3. Books and stories. The child had picture
books and/or had listened to stories about the
topic.

4. Story and music records. The child had music
or story records about the topic.

5. Television. The child was known to have
watched television programs about the topic.

6. Personal experience. The child had seen real
counterparts of the persons, objects, and situations
of the topic.

7. Talk of other adults. The child was known
to have talked about the topic with adults other 
than parents and preschool teachers, such as grand- 
parents and other relatives, the maid, and adult 
friends and visitors. 

8. Talk with children. The child was known to 
have talked about the topic with children other 
than those in the preschool group, such as siblings 
or neighborhood playmates. 

One source of information, movies, was not used
as a home experience because parents checked that
only 16 girls and 34 boys had attended movies concerned
with dramatic play topics. The score for
varied home experiences with the dramatic play
topics was the percentage of play topics for which
parents had checked four or more home experiences
as providing information. It was titled “Four or
more checked.”
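
The percentage scores described above can be sketched as follows. The input is assumed to be, for each dramatic play topic of the child's group, the set of home experiences checked by either or both parents; that data layout and the function names are illustrative assumptions, not the study's tabulation sheets.

```python
from typing import Dict, Set


def experience_percent(checks: Dict[str, Set[str]], experience: str) -> float:
    """Percentage of the group's play topics for which an experience was checked."""
    topics_checked = sum(1 for exps in checks.values() if experience in exps)
    return 100.0 * topics_checked / len(checks)


def four_or_more_percent(checks: Dict[str, Set[str]]) -> float:
    """Percentage of play topics with four or more home experiences checked."""
    varied = sum(1 for exps in checks.values() if len(exps) >= 4)
    return 100.0 * varied / len(checks)


# Hypothetical check list for one child and a group with four play topics.
checks = {
    "House and family": {
        "Talk with father", "Talk with mother", "Books and stories",
        "Television", "Personal experience",
    },
    "Trains": {"Books and stories", "Personal experience"},
    "Doctor and nurse": {
        "Talk with mother", "Books and stories",
        "Television", "Personal experience",
    },
    "Modern police": set(),
}
print(experience_percent(checks, "Television"))         # prints 50.0
print(experience_percent(checks, "Books and stories"))  # prints 75.0
print(four_or_more_percent(checks))                     # prints 50.0
```

Because the denominator is the number of topics played in the child's own preschool group, scores remain comparable across groups whose topic lists differ in length.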

Time for Stories, Records, and Television. Reports
by parents of the time the child spent
listening to stories and records and watching television
furnished three measures:

1. Minutes of story, expressed as daily minutes
the child listened to a story.

2. Minutes of records, expressed as daily minutes
the child listened to records. Most parents said
their child had spurts of listening to records, rather
than a regular daily pattern. This time was the
parents’ average of the spurts of listening time.

3. Minutes of television, expressed as daily
minutes the child paid attention to television
programs.


Time Talking with Family Members and Maids. 
Four estimates of the time spent talking with 
family members and the maid were derived from 
the parent interviews. 

1. Father's time talking, expressed as weekly 
hours of talking to and with the child. This is 
the time the father reported as spent in talk with 
the child, individually or jointly with other family 
members, in all daily and weekly activities. 

2. Mother's time talking, expressed as weekly 
hours of talking to and with the child. This is the 
time the mother reported as spent in talk with the 
child, individually or jointly with other family 
members, in all daily and weekly activities. 

3. Siblings’ time talking, expressed as daily hours 
of talking with the child. This is the time either 
or both parents reported as usually spent by the 
child in talk with older and/or younger siblings 
each day, at all activities. 

4. Maid’s time talking, expressed as weekly hours 
of talking with the child. This time includes that
for baby sitters as well as the usual Negro maid 
of families of this economic level in Kentucky. 
This is the time either or both parents reported 
that the maid and/or baby sitter spent in talk with 
the child during a usual week. 

Education of Parents. The number of school 
years completed was the measure of the education 
of fathers and mothers. 

Parent Attitudes. Parents were asked to agree
or disagree, strongly or mildly, with the items
expressing “attitudes contrary to the usually approved
child-rearing opinions” of the Parental
Attitude Research Instrument (PARI) developed
by Schaefer and Bell (1958, p. 346). Sixty mothers 
completed the PARI at meetings of preschool 
mothers’ groups. All fathers and the remaining 
mothers answered the PARI at home, after instruc- 
tions to avoid discussion of items with their 
spouses until both had completed the PARI. One 
hundred mothers and 93 fathers completed the 
PARI, all but 12 doing so in the time interval 
between the initiation of preschool observations 
and the parent interview. The mother and eight 
fathers not completing the PARI were willing to 
be interviewed, but refused to do this test. All 
parents complained about the PARI during the 
parent interview. Complaints tended to emphasize 
the discourtesy of requesting individuals of their 
educational attainment to agree or disagree with 
the loosely worded sentences of faulty construction 
that they found in the test. 

The form of the PARI used in this study was 
that presented in the Iowa Parent Practices Research
Scales for Fathers and Mothers by Chantiny,
Lovell, and McCandless (1956). This selection
includes 24 five-item scales. For each item, four 
points were given for a parental check of strong 
agreement, three points for mild agreement, two 
points for mild disagreement, and one point for 
strong disagreement. The scores for each scale 
range from 5 to 20 points. 
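
The item scoring just described can be sketched as follows: each of the five items of a scale earns 1 to 4 points, so a scale score falls between 5 and 20. The response labels and the function name are illustrative assumptions.

```python
from typing import List

# Points per response, as stated in the text.
POINTS = {
    "strong agreement": 4,
    "mild agreement": 3,
    "mild disagreement": 2,
    "strong disagreement": 1,
}
ITEMS_PER_SCALE = 5


def scale_score(responses: List[str]) -> int:
    """Sum the points for the five items of one PARI scale (range 5 to 20)."""
    if len(responses) != ITEMS_PER_SCALE:
        raise ValueError(f"a scale has {ITEMS_PER_SCALE} items")
    return sum(POINTS[response] for response in responses)


# Hypothetical responses of one parent to one scale.
responses = [
    "strong agreement",
    "mild agreement",
    "mild disagreement",
    "strong disagreement",
    "mild agreement",
]
print(scale_score(responses))  # prints 13
```
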

Five composite scales, described by Schaefer and
Bell (1955) as five factors in the factorial struc- 
ture for the normative study, were used in addition 
to the 24 scales. The scores for the composite 
scales are the sum of the points for each scale 
included in the composite scale, listed below. 

1. Suppression and Distance (7 scales) : Avoid- 
ance of communication, Suppression of sexuality, 
Ascendance of parent, Encouraging verbalization 
(inverted), Approval of activity (inverted), Com- 
radeship and sharing (inverted), and Autonomy 
(inverted). 

2. Unhappiness at Home (titled “Rejection of 
homemaking role" by Schaefer and Bell) (5 
scales) : Encouraging verbalization, Approval of 
activity, Marital conflict, Irritability, and Rejection 
of homemaking role. 

3. Demand for Striving (5 scales): Breaking 
the will, Strictness, Deification of parent, Approval 
of activity, and Excluding outside influences.

4. Overpossessiveness (5 scales): Suppression 
of aggression, Suppression of sexuality, Intrusive- 
ness, Fostering dependence, and Harsh punishment 
(inverted). 

5. Harsh Punitive Control (8 scales): Breaking
the will, Strictness, Deification of parent, Harsh
punishment, Excluding outside influences, Irritability,
Seclusiveness of parent, and Ascendance of
parent.

Five scales included by Schaefer and Bell in these 
composite scales, but not included in the Iowa 
selection were as follows: Ignoring the baby, in 
1 and 2; Abdication of parental role, in 2; Accel- 
eration of development, in 3; Martyrdom, in 3 
and 4; and Infantilization, in 4. Five scales in- 
cluded in this study that did not enter into the 
factorial structure described by Schaefer and Bell
were Equalitarianism, Deceit of the child, Express- 
ing love and affection, Considerateness of spouse, 
and a scale called Dependence of mother for the 
mothers’ version that is more appropriately titled 
Disapproval of ascendance of mother in the 
fathers’ version. 
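
A composite score is the sum of the scores of its member scales, some of which enter inverted. The text does not say how an inverted scale was scored; the sketch below assumes, purely for illustration, that a 5-to-20 scale score is reflected as 25 minus the score, so that 5 becomes 20 and 20 becomes 5. The function names and example scores are hypothetical.

```python
from typing import Dict, List, Tuple

# ASSUMPTION: inversion reflects a score within the 5-20 range (5 + 20 = 25).
REFLECTION_CONSTANT = 25

# Composite 4 from the list above: (scale name, entered inverted?).
OVERPOSSESSIVENESS: List[Tuple[str, bool]] = [
    ("Suppression of aggression", False),
    ("Suppression of sexuality", False),
    ("Intrusiveness", False),
    ("Fostering dependence", False),
    ("Harsh punishment", True),
]


def composite_score(
    scale_scores: Dict[str, int], members: List[Tuple[str, bool]]
) -> int:
    """Sum member scale scores, reflecting those marked as inverted."""
    total = 0
    for name, inverted in members:
        score = scale_scores[name]
        total += REFLECTION_CONSTANT - score if inverted else score
    return total


# Hypothetical scale scores for one parent.
scores = {
    "Suppression of aggression": 12,
    "Suppression of sexuality": 10,
    "Intrusiveness": 14,
    "Fostering dependence": 11,
    "Harsh punishment": 8,
}
print(composite_score(scores, OVERPOSSESSIVENESS))  # prints 64
```

Under this assumption a five-scale composite ranges from 25 to 100 whatever the number of inverted members.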


RESULTS 


Developmental Characteristics of Child 
Social Behavior Measures 


This investigation explored preschool 
children's use of language and hostility dur- 
ing child directed play in preschool groups. 
Dramatic play, the acting out of the real 
and imaginary scenes of life encountered 
by children, is a frequent characteristic of 
this play and was used in this study as the 
basis of one classification of use of language 
and hostility. The list of dramatic play 
topics for the preschool groups of this study 
is, then, a definition as well as a finding. 


Eight dramatic play topics were played
sometimes or often in all five preschool
groups observed in the spring of 1957.
These were the following, in the order of 
frequency of occurrence within groups: 

House and family 

Cowboy or “western” lore 

House construction 

Road construction 

Animals that crawl and growl 

Doctor and nurse 

Trains 

Modern police 
The western scenes resembled television 
westerns in which people, rather than 
animals, are shot. The police sirened around 
on tricycle squad cars or directed traffic. 
The animals that crawled and growled and 
acted as bears are supposed to act were 
seldom bears in name; instead in the various
groups these animals were called tigers,
wolves, wildcats, crocodiles, lions, hound 
dogs, rabbits, donkeys, and even whales. 


Dramatic play topics recorded in two to 
four of the groups observed were the fol- 
lowing, also in the order of frequency 
within groups: 

Automobiles, garages, and gas stations 

Parties and weddings 

Tunnels in mountains 

Grocery store 

Boats and water travel 

Airplanes and airports 

Construction of castles and bridges 

Fire engines and fires 

Death 

Getting and counting money 

Zoos and cages 

Basketball games 

Witches 

Dynamiting 

Swimming 

Putting on a play 


The following were observed as dramatic
play topics in only one group:
Modern warfare 
Sunday school 
Island in the ocean 
Boss of the job 
Santa Claus 
Farming 
Photography 
Water skiing 
Caves 
Priest 


As these lists make evident, only night life
and a few adult privileges and responsibilities
escaped the scope of the play of these
children. Geography was not excluded; play
was usually located by some domestic or
foreign place name.

When the children spoke or showed hostility
in acting out a role dealing with any
of the listed topics, this behavior was classified
as dramatic play use of language and
hostility. All other use of language and
hostility was classified as reality use.


Dramatic Play and Reality Use of 
Language and Hostility 


The mean percentage of observation rec- 
ords in which categories of dramatic play 
and reality use of language occurred for 
each age and sex group is shown in the 
figures presented in this section. All figures
are drawn on the same scale, with 60% as 
the largest percentage. 

Dramatic play and reality suggestion and 
agreement, shown in Figure 1, were the 
most frequent uses of language. The resem- 
blance between dramatic play and reality 
use of suggestion and agreement is limited
to that statement, however. 

Dramatic play use of suggestion and
agreement increased as the age of the child
increased, as is shown in Figures 1a and
1b, while reality use of suggestion and
agreement failed to change with age, as is
shown in Figures 1c and 1d. These findings
indicate that increased use of language in
dramatic play accounts for children’s greater
talkativeness with peers as age increases,
and that reality use of language is relatively
unimportant in this increased use of language.
This difference in the effect of age,
and the difference in reliability reported in
the methods section, were the first indicators
in this investigation that language and hostility
used in dramatic play and used in
reality talk as self might be new and opposite
variables.

Children's dramatic play often is de- 
scribed as a fantasy activity that expresses 
individual wishes and desires. The develop- 
mental difference for dramatic play use of 
language is not in agreement with the idea
of expression of individual wishes and
desires.


Within the past 5 years, Harris (1957) 
claimed as a new discovery the idea that 
some behavior of children does not change 
with age. The reality suggestion and agree- 
ment data are in line with this idea, which 
is still too new to have further assessment 
of meaning. Hence, differences in other 
relations between the developmental use of 
language, dramatic play use, and the reality 
use of language, unaffected by age, may 
elucidate differences in behavior affected 
and not affected by the increasing age of 
the child. 

Age increases in the frequency of use of 
dramatic play suggestion and agreement 
were significant beyond the .001 level in 
factorial design age by sex analyses of 
variance. Age differences in frequency of 
reality suggestion were not significant. Dif- 
ferences in use of reality agreement were 
not significant between the three elder age 
groups. The youngest children, those aged 
2½-3½ years, used agreement less (.01 level)
when they talked as themselves than the 
three older age groups. 

Boys used dramatic play language more 
frequently than girls, but sex differences in 
use of reality language either did not exist 
(suggestion), or were in the opposite direc- 
tion (agreement), as is shown in Figure 1. 
The effect of age is shown to be about the 
same for boys and girls. The sex difference 
favoring boys was significant beyond the 
.001 level for both dramatic play suggestion
and agreement. Girls’ use of reality agree- 
ment differed at the .05 level from that of 
boys. The age-sex interaction was nonsignificant
for measures in both classifications.

Large individual differences in the fre- 
quency of use of both dramatic play and 
reality suggestion and agreement were 
found in all age and sex groups, as is shown 
in Table 3. One boy in the 4½-5½ year
group made dramatic play suggestions in
86% of the observations, and obviously was
engaged in dramatic play during most of
the observation time. No dramatic play use
of language was recorded for one girl in
the 3½-4½ year group. Similar ranges occurred
for children's use of reality suggestion
and agreement with other children.
There were only four zero percentages for

TABLE 3

STANDARD DEVIATION AND RANGE OF THE PERCENTAGE OF OBSERVATION RECORDS IN WHICH DRAMATIC
PLAY AND REALITY SUGGESTION AND AGREEMENT WERE USED IN EACH AGE AND SEX GROUP

                             Dramatic play                      Reality
                       Suggestion      Agreement       Suggestion      Agreement
Age and sex groups     SD     Range    SD     Range    SD     Range    SD     Range

2½-3½ year
  girls                6.6    0-19     4.8    2-16     8.7    18-45    3.6    16-26
  boys                 16.5   5-59     7.3    1-22     22.7   14-78    15.8   19-43
3½-4½ year
  girls                18.9   0-71     14.0   0-53     15.9   0-60     9.6    12-50
  boys                 23.7   3-66     18.9   6-57     17.5   2-67     11.4   11-45
4½-5½ year
  girls                16.6   7-68     11.8   10-43    18.8   10-78    13.5   12-57
  boys                 15.2   30-86    13.4   22-75    11.2   23-61    8.0    21-47
5½-6½ year
  girls                1.1    19-57    10.7   10-45    6.0    35-52    4.6    35-50
  boys                 13.5   18-76    12.0   28-70    10.7   23-56    6.7    17-44


these measures, however. Three zero scores
were for the girl just mentioned, and the
other was for dramatic play suggestion by
a girl in the youngest age group.

Imitation of the words and sounds of
other children was not a frequent behavior
of the observed children, as is shown in
Figure 2. Six girls and one boy failed to
use any imitative language in dramatic play
during the observation periods. Ten girls
and 20 boys, scattered through all age
groups, did not use any imitative language
in talking as themselves with other children.
The slightly larger percentage of imitation
for children aged 3½-4½ years, shown in
Figure 2, was significantly larger (.01 level)
only for reality imitation in analysis of
variance tests.

Use of dramatic play hostility resembled
use of dramatic play language in the characteristic
of increasing in frequency with
age, while use of reality hostility resembled
use of reality language in the characteristic
of little or no change with age, as is shown
in Figure 3. When friendly and hostile
behavior, given the additional classification
of dramatic play or reality behavior, do not
differ in relations with another variable, age
in this instance, the additional classification
can be described as broader in scope than
the friendly-hostile dichotomy. These data
add the descriptive word “general” to the
previously suggested possibilities of “new”
and “opposite” for the dramatic play and
reality classifications.

The age group differences in frequency 
of dramatic play hostility were significant 
beyond the .001 level in analysis of variance 
tests, while the frequency of reality hostility 
did not differ significantly in the four age 
groups. Somewhat surprisingly, then, the 
high level of reality hostility for 4½-5½ year
boys, shown in Figure 3, did not differ 
statistically from the levels of the other age 
and sex groups. The similarity of mean 
percentages for other age and sex groups 
obviously served to counterbalance this 
single age and sex group deviation in the 
analysis of variance tests. Although all age 
and sex groups were equally variable by 
Bartlett’s test, the standard deviation for
the 4½-5½ year boys, shown in Table 4,
suggests that boys in this group varied less 
from their unusually high mean occurrence 
of reality hostility than boys in other age 
groups. 

Boys used both classifications of hostility
more frequently with other children than
girls. This finding agrees with results of
all studies of aggressive behavior of preschool
children. The sex difference was
significant at the .001 level for dramatic
play hostility, and at the .025 level for
reality hostility in analysis of variance tests.
Reality hostility was used more frequently
than dramatic play hostility in all age and
sex groups, as is shown in Figure 3, a difference
not found for language use. The relative
proportions of reality and dramatic play
hostility, respectively, were two to one for
boys and three to one for girls.

Individual differences in frequency of use
of reality hostility covered almost all the
percentage range, but a narrower range was
found for the less frequent use of dramatic
play hostility, as is shown in Table 4. The
upper range extremes for reality hostility
of 92-97% indicate that it was a rare 5
minutes for one girl and several boys past
3½ years of age when they did not show
reality hostility to other children. The lower
range extreme was 4% occurrence for a
girl and a boy in the 3½-4½ year groups;
it was an unusual 2 minutes for these two
children when they used reality hostility
with peers. There were no children who
failed to show reality hostility, but some
children did not use dramatic play hostility.
No dramatic play hostility was observed for
four boys and four girls in the youngest
age group, and for one girl in the oldest
age group.


TABLE 4

STANDARD DEVIATION AND RANGE OF THE PERCENTAGE OF OBSERVATION RECORDS IN WHICH DRAMATIC
PLAY AND REALITY HOSTILITY WERE USED IN EACH AGE AND SEX GROUP

                     Dramatic play hostility              Reality hostility
                     Girls            Boys             Girls            Boys
Age group            SD     Range     SD     Range     SD     Range     SD     Range

2½-3½ years          3.73   0-9       7.12   0-22      13.74  7-46      21.49  10-73
3½-4½ years          7.81   0-23      16.07  0-61      21.06  4-92      19.24  4-87
4½-5½ years          22.61  0-81      12.49  16-58     14.86  9-59      15.35  34-94
5½-6½ years          6.24   0-22      10.05  8-40      14.52  12-52     17.81  19-93


Children's use of language to greet, or to
say, “Hello,” to other children and adults,
was predominantly in talk as themselves
rather than in dramatic play. Reality recognition
of the presence of others occurred
in about one-fifth of the observation records
for all age and sex groups as is shown in
Figure 4a, while dramatic play use of
greeting occurred in fewer than 6% of the
records for any age or sex group.

Relations with age and sex for use of
greeting in the reality and dramatic play
classifications resembled those found for
these classifications of hostility and other
language use. Frequency of use of reality
greeting did not change with age and sex
group, in analysis of variance tests, but use
of dramatic play greeting increased with
age (.005 level), and was more frequent for
boys (.005 level).

Children asked twice as many reality
questions of children and adults as dramatic
play questions. The percentages for occurrence
of reality questions, shown in Figure
4b, agree with the findings on frequency
of questions reported by Day (1932), Davis
(1937), McCarthy (1930), and Smith
(1933). Questions constituted 10-15% of
the children’s responses in these studies. The
basis of the percentages presented in Figure
4b is number of 2-minute records rather
than number of responses, however. Analysis
of variance and t tests for reality questions
indicated the only age and sex difference
to be that 2½-3½ year boys asked more
reality questions (.005 level) than any other
age and sex group.

Fig. 4. The mean percentage of observation
records in which reality greeting and question were
used in eight age and sex groups.


Dramatic play questions were asked in 
a mean of 6% or fewer records in all age 
and sex groups, and were not asked at all 
by 12 boys and 24 girls. Frequency of 
dramatic play questions increased with age 
(.005 level) and was greater for boys (.005 
level) by analysis of variance tests. 

The preceding description of frequency 
of use of the various categories of language 
has been limited to children's use of lan- 
guage during friendly approach interactions. 
Comparable data on frequency of associa- 
tion use of language and conversation use 
of language are not presented here. Friendly 
approach use of language with peers related 
closely to association use of language and 
to conversation use of language, as is shown 
in Table 5. Use of dramatic play suggestion 
in all three social interaction categories was 
almost identical. This degree of resemblance
was greater (.01 and .05 level) than
found for use of either reality suggestion
or reality agreement during all three types
of social interaction. Nevertheless, the r's 
for reality use of language are both large 
and significant. These data suggest that 


TABLE 5

INTERRELATIONS OF ASSOCIATION, FRIENDLY
APPROACH, AND CONVERSATION INTERACTION
USE OF LANGUAGE AS SHOWN IN AVERAGE
CORRELATIONS FOR THE AGE GROUPS OF
GIRLS AND BOYS

Interactions correlated              Dramatic play   Reality       Reality
                                     suggestion      suggestion    agreement

Association and friendly approach
  Girls                              .96*            .75*          [illegible]
  Boys                               .98*            .87*          .85*
Association and conversation
  Girls                              .94*            .83*          .83*
  Boys                               .86*            .77*          .68*
Friendly approach and conversation
  Girls                              .95*            .79*          [illegible]
  Boys                               [illegible]     .73*          .60*

Note.—Boys’ average r's, printed in italics in the original, are given in the rows labeled Boys.
* Significant at .01 level.

distinction of use of language according to
the friendly interaction category in which
it is used, results in relatively similar measures
for each child in each interaction
category.


Aggression, Submission, and Dominance 


The scores for positive aggression, sub- 
mission, aggression, and dominance were 
defined as combinations of measures in the 
dramatic play and reality classifications. The 
findings presented in earlier parts of this 
section suggest that these scores combine 
measures of different variables, and that 
relations between these scores and the age 
and sex of the child will depend on the 
relative proportions of dramatic play and 
reality measures in the combined score. 

Age differences for the combined scores
were limited to a difference for one age
group, 4½-5½ year boys and girls, and to the
scores that included measures of hostility.
Children aged 4½-5½ years showed more
aggression and dominance than children of
other ages. As was shown earlier in Figure
3, this age group had higher percentages of
use of both dramatic play hostility and
reality hostility than other age groups. Both
classifications of hostility were combined
with use of language measures in aggression
and dominance scores. No age differences
were found in analysis of variance tests for
the scores that combined only use of language
measures, positive aggression, and
submission.

All four combined scores were larger for 
boys than for girls. As has been reported, 
boys had higher percentages than girls for 
use of dramatic play language and of both 
dramatic play and reality hostility. 


Social Interactions with Children


Interactions between children were the 
necessary basis for use of language and hos- 
tility with other children. The mean number 
of interactions in 2 minutes for each age 
and sex group is shown for the friendly 
approach and hostile categories in Figure 5, 
and for the association and conversation 
categories in Figure 6. 

Age and sex differences in social interaction
are clearer when the data in Figures
5 and 6 are translated into terms of frequency
of interaction. Boys in the 2½-3½
year group had almost one association and
one friendly approach interaction with other
children in each 2 minutes, but had a conversation
with another child less frequently,
once in 18 minutes. Boys in the 5½-6½ year
group had two association and two friendly
approach interactions with other children
each 2 minutes, and held a conversation with
another child about every 5 minutes. The
youngest group of girls had an association
interaction with another child every 6 minutes,
a friendly approach interaction every
3-4 minutes, and a conversation with another
child once in 40 minutes. In each
2 minutes, the oldest group of girls had an
association interaction with one and a half
children and a friendly approach interaction
with two children; they held a conversation
with another child about every 5 minutes.
Tests of age differences were significant
beyond the .001 level in analyses of variance
for the three friendly classifications.
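
The translation used in this paragraph, from a mean number of interactions per 2-minute record to "one interaction every so many minutes," is a simple reciprocal. A minimal sketch follows; the function name is an illustrative assumption, and the example means are chosen to match the verbal descriptions rather than read from Figures 5 and 6.

```python
RECORD_MINUTES = 2.0  # each observation record covered 2 minutes


def minutes_per_interaction(mean_per_record: float,
                            record_minutes: float = RECORD_MINUTES) -> float:
    """Average number of minutes between interactions."""
    if mean_per_record <= 0:
        raise ValueError("mean interactions per record must be positive")
    return record_minutes / mean_per_record


# A mean of 2 interactions per record is one interaction every minute.
print(minutes_per_interaction(2.0))                # prints 1.0
# A conversation "about every 5 minutes" corresponds to a mean of 0.4.
print(round(minutes_per_interaction(0.4), 1))      # prints 5.0
# A conversation "once in 18 minutes" corresponds to a mean of about 0.11.
print(round(minutes_per_interaction(2.0 / 18.0)))  # prints 18
```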

The mean number of hostile interactions 
for the age and sex groups, shown in Figure 
5, are those that would be expected from 
the previous description of mean percent- 
ages of use of dramatic play and reality 
hostility. For the measures of number of 
hostile interactions, however, the unusually 
high level of hostility shown by 4½-5½ year
boys differed (.01 level) from the level of 
hostility of boys of other ages and of girls 
of all ages. This was noted earlier for use 
of reality hostility, but the difference for 
that measure was not statistically significant. 
The range of mean hostile interaction in 
2 minutes for 4½-5½ year boys was from
.70 to 1.32, and the standard deviation was 
AVA 

The author’s impression of almost all 
boys of this age, regardless of school group, 
was that they were noisy, rough, slam bang 
“shooters” and “clobberers,” and that they 
were regarded as public enemies by the girls 
in the groups during most of the observed 
play. Most of the relatively self-controlled 
and placid boys in the 5½-6½ year group
had attended the preschools the previous 


year; by teachers’ reports, the year before 
they had been as excited and excitable as 
the 4½-5½ year boys of this study. Freudian
theory suggests that boys may be solving 
the Oedipal complex at about this age. The 
hostile behavior of these boys suggested 
they were solving some life problem or were 
generally frustrated. The small size of the 
standard deviation and range suggest that 
their high level of hostility was a problem 
of development or of culture, and was not 
due to large individual differences in the 
boys’ behavior within this age group. 

Sex differences were significant beyond
the .001 level for hostile interactions, beyond
the .005 level for association interactions,
and beyond the .05 level for friendly
approach interactions. This finding disagrees
with the finding of no sex difference
by Marshall and McCandless (1957a) for
the social interactions of 18 boys and 18
girls attending an Iowa preschool. Mean
social interaction scores for the same categories
from the two studies are listed in
Table 6. Data for Kentucky children aged
3½-5½ years are used rather than those for
all children. This limitation made ages of
the Kentucky and Iowa groups comparable,
but not matched. The greater amount of
social interaction for the Kentucky children
may indicate that these groups included
more older children. The author collected
the majority of records in both studies, so
there should have been no differences in

TABLE 6 


MEAN SOCIAL INTERACTION SCORES ОЕ Bovs AND 
GIRLS IN KENTUCKY AND Iowa Stupres 


Social interaction scores 
Sex groups 
Association + 
friendly Hostile 
approach 
Kentucky, 314-514 years 
35 girls 2.80 .58 
35 boys 3.27 .78 
Iowa, 4-1 to 5-7 years 
18 girls 2.24 32 
18 boys 1.84 .38 
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observation method or observers. It is 
likely, however, that the occurrence of sex 
differences in friendly social interaction 
scores depends on the individuals in, or the 
constitution of, the groups. 


Social Acceptance by the Group 


Picture sociometric scores, the measures 
of children's popularity within the preschool 
group, did not differ with the age or sex 
of the child. This finding could have been 
predicted from the data just presented and 
the relations reported by McCandless and 
Marshall between sociometric and social 
interaction scores. In the Iowa study, girls
had significantly higher sociometric scores 
than boys, but did not differ from boys in 
the number of friendly interactions. The 
boys had significantly more friendly inter- 
actions in the present study, but did not 
differ from the girls in popularity. 


Test Vocabulary and Aggression 


The children of this study had a mean 
vocabulary age that was 2 years beyond 
their chronological age, as is shown in age
and sex group mean scores on the Stanford- 
Binet Vocabulary test presented in Figure 7. 
Age, but not sex, differences were signifi- 
cant (.001 level) in the analysis of variance 
test. 

The boys and girls in this study showed
a decrease in test aggression toward the
experimenter with increasing age, with the
exception of 4½-5½ year boys. The interaction
between sex and age was significant
(.025 level) in the analysis of variance test. It
is shown in the mean test aggression scores
for the age and sex groups presented in
Figure 8.


Summary 


Two classifications of measures developed 
to explore children’s use of language and 
hostility in child directed play of preschool 
groups entered into different relations with 
the age and sex of the child. An increase 
with age was found in the frequency of 
use of five categories of dramatic play language
and hostility: suggestion, agreement,
hostility, greeting, and question. Frequency 
of use of these five categories in the reality 
language and hostility classification did not 
change as the age of the child increased. 
Boys used dramatic play language and hos- 
tility more frequently than girls. The sexes 
did not differ in use of reality language, 
with one exception; girls used reality agree- 
ment more frequently than boys. Boys 
showed reality hostility more frequently 
than girls. 

Four scores for aggression, submission, 
and dominance (combinations of measures 
of use of language and hostility) were 
larger for boys than for girls. The scores 
did not increase in size as age of the child 
increased. Higher scores for aggression and 
dominance were obtained for children aged 
4½-5½ years. This age difference resembled
nonsignificant trends for dramatic play and 
reality hostility. 

Friendly interactions between children 
increased with age and were more frequent 
for boys than girls. The only age difference 
for hostile interactions was that boys in the 
4½-5½ year group had an unusually large
number of hostile interactions. Boys had 
more hostile interactions than girls. 

Social acceptance in the preschool groups 
did not differ with age or sex of the child. 
Children in all age groups had a mean 
vocabulary age that was about 2 years 
beyond their chronological age. Test aggres- 
sion decreased as age increased, except that 
4½-5½ year boys showed almost as much test
aggression as 2½-3½ year boys.


Relations between Measures of 
Children’s Social Behavior 


Age and sex differences have been re- 
ported between the dramatic play and 
reality classifications of use of language and 
hostility with peers. These differences sug- 
gest that the two classifications of use of 
language and hostility may be different 
variables. 

Dramatic play and reality use of language 
and hostility can be described as disparate 
variables if: 


1. The measures in each category are
positively related, and either are not related
or are negatively related to measures in the
other category.

2. The two categories of measures enter
into different relations with measures for
other variables.

Both types of relations are included in the
correlations between measures of preschool
social behavior described in this section.
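
The first requirement can be read as a simple check over a table of correlations. The sketch below is only an illustration of that reading: the function name, the measure labels, the example r's, and the choice to treat "related" as "statistically significant" are all assumptions made here, not part of the original analysis.

```python
from itertools import combinations, product


def first_criterion_met(r, significant, class_a, class_b):
    """Check requirement 1 for two classifications of measures.

    r           -- dict mapping frozenset({measure_1, measure_2}) to a correlation
    significant -- set of frozenset pairs judged statistically significant
    class_a/b   -- lists of measure names in each classification
    Pairs absent from r (correlations not computed) are skipped.
    """
    def positively_related(pair):
        return pair in significant and r[pair] > 0

    within = [frozenset(p) for cls in (class_a, class_b)
              for p in combinations(cls, 2)]
    between = [frozenset(p) for p in product(class_a, class_b)]

    # Within each classification: every computed r is significant and positive.
    within_ok = all(positively_related(p) for p in within if p in r)
    # Between classifications: no computed r is significant and positive.
    between_ok = all(not positively_related(p) for p in between if p in r)
    return within_ok and between_ok


# Hypothetical example values, chosen only to exercise the function.
dramatic = ["dp_suggestion", "dp_agreement"]
reality = ["re_suggestion", "re_agreement"]
example_r = {
    frozenset(("dp_suggestion", "dp_agreement")): 0.60,
    frozenset(("re_suggestion", "re_agreement")): 0.55,
    frozenset(("dp_suggestion", "re_agreement")): 0.10,
    frozenset(("dp_agreement", "re_suggestion")): -0.15,
}
example_sig = {
    frozenset(("dp_suggestion", "dp_agreement")),
    frozenset(("re_suggestion", "re_agreement")),
}
print(first_criterion_met(example_r, example_sig, dramatic, reality))
```

The second requirement compares relations with outside variables (social participation, social acceptance), so it is a matter of comparing rows of correlations rather than a single yes-or-no check.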


Relations testing the assumption that use 
of language and hostility is involved in 
influencing and adjusting to age peers are 
described in this section, also. These are 
the correlations obtained between measures 
of dramatic play and reality use of language 
and hostility and the scores for social par- 
ticipation and social acceptance of the 108 
children. 

Age and sex differences in the frequency 
of use of language and hostility indicated 
that correlations for these frequency meas- 
ures should be done separately for each age 
and sex group. Therefore, most correlations
presented in this report are averages
of product-moment r’s for each age and sex
group, obtained by use of the r to z transformation.
Significance of average r’s, and
of differences in r's, were determined for
the z's of the r’s, as described in McNemar
(1955).
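
The averaging procedure can be sketched as follows. This is a minimal illustration of the standard Fisher r to z method, not the monograph's own computation: the function names are hypothetical, and weighting each group by n - 3 is the textbook convention, assumed here because the group sizes are not given in this section.

```python
import math


def average_r(rs, ns=None):
    """Average correlations through Fisher's r to z transformation.

    rs -- correlations from the separate age and sex groups
    ns -- optional group sizes; when given, each z is weighted by n - 3
    """
    weights = [1.0] * len(rs) if ns is None else [n - 3 for n in ns]
    zs = [math.atanh(r) for r in rs]          # r to z
    mean_z = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return math.tanh(mean_z)                  # z back to r


def z_for_average_r(rs, ns):
    """Normal deviate for testing an average r against zero.

    The weighted mean z has standard error 1 / sqrt(sum(n_i - 3)).
    """
    weights = [n - 3 for n in ns]
    mean_z = sum(w * math.atanh(r) for w, r in zip(weights, rs)) / sum(weights)
    return mean_z * math.sqrt(sum(weights))


# Illustrative values only: two group r's of .73 and .58, equal weights.
print(round(average_r([0.73, 0.58]), 2))
```

Averaging in the z metric rather than averaging the r's directly matters most when the r's are large, because the sampling distribution of r becomes skewed as it approaches 1.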


Use of Dramatic Play and Reality Language
and Hostility 


Evidence that measures in the dramatic
play classification are positively related to
each other, and that measures in the reality
classification also are positively related to


? Almost half of the statistical computations of 
this investigation were performed on the IBM 
650 computer of the Computing Center of the 
University of Kentucky, of which J. W. Hamblen 
is Director. Assistance on computations performed 
on desk calculators was given by four graduate 
assistants: Joann Atcher, Rohini Doshi, Rachel C. 
Graves, and Hazel McCrary; and by the following 
undergraduate students: Mary Robert Barger, 
Alice Evenburgh, Dixie Grugin, Carolyn Houston, 
Barbara Landrum, Lynne Santen, Ruth Thornton,
Joyce Ann Wood, and Betty Young. 


each other, is presented in Table 7. Within
each classification, the children who made
the most suggestions were the children who
agreed more frequently, and who showed
more hostility toward peers.

It is generally believed that children who 
make the most suggestions during play with 
peers are different individuals from those 
who most frequently agree with the sug- 
gestions of others. The close relations be- 
tween suggestion and agreement within both 
the dramatic play and the reality classifica- 
tions contradict this belief. These findings 
indicate that the "ascendant" child, who 
makes the most suggestions, is also the 
“submissive” child, in the sense that he 
most frequently agrees with the suggestions 
of others in dramatic play or in talk as 
himself with peers. 

Use of language and hostility during 
dramatic play did not relate to use of lan- 
guage and hostility during reality play, as 
is shown in most of the correlations between 
classifications listed in Table 7. The children
who talked most frequently during
dramatic play were not necessarily, then,
the children who used reality language and
hostility most frequently in talk with peers.
A clear implication of these data is that
the descriptions of “talkative” or “highly
verbal” cannot accurately describe children's
use of language and hostility during both
dramatic play and talk as themselves with
peers.

These findings strongly support the con- 
ception of the dramatic play and reality 
classifications as opposite or disparate vari- 
ables. They meet the first requirement for 
such description presented in the introduction
to this section: positive relations within
each classification, and either no relations 
or negative relations between the two 
classifications. 

The positive relations between friendly 
and hostile behavior within each of the 
dramatic play and reality classifications 
support an idea presented in the preceding 
section, that the dramatic play and reality 
classifications constitute a broader, or more 
general dichotomy than that of friendly and 
hostile behavior. 

Relations for two measures that were 
not duplicated in both classifications, dra- 
matic play imitation and reality greeting, 
constitute the exceptions to positive relations 
within classifications, as is shown in Table 
7. The use of reality greeting with children 
and adults failed to relate to any measure 
of use of language and hostility in either 
the dramatic play or reality classification. 


TABLE 7 


AVERAGE CORRELATIONS OF THE EIGHT AGE AND SEX GROUPS BETWEEN MEASURES OF USE OF
DRAMATIC PLAY AND REALITY LANGUAGE AND HOSTILITY


————— 
Dramatic play Reality 
sa Measures of po 
language and 
hostility Sugges- | Agree- | Imita- Hos- | Sugges- | Agree- Hos- Greet- 
tion ment tion tility tion ment tility ing 
E нга, pov c] 
Dramatic play: 
Agreement .65** 
Imitation .17 .21 
Hostility .59** = = 
Reality: 
Suggestion .315* .03 —.34** | —.04 
Agreement 17 13 =.27* —.24 .63* 
Hostility .07 — — .13 47% 19 
Greeting 01 =. € ‘00 | .07 .00 .05 
i .08 = == —.05 .54** .46** „13 .12 


* Significant at .05 level.
** Significant at .01 level.

The other measure combining use with
peers and adults, reality question, resembled 
measures of use of reality language and 
hostility with peers. Correlations were not 
computed between all measures and dra- 
matic play agreement and imitation, as is 
indicated by dots in Table 7. 

There are three exceptions in Table 7 to 
the lack of relations between classifications. 
One is a positive relation between dramatic 
play suggestion and reality suggestion. This 
average r is significantly smaller than the 
average r’s between dramatic play sugges- 
tion, agreement, and hostility, and between 
reality suggestion and agreement. It is not, 
then, in real contradiction to the general 
trend. The other exceptions are that dra- 
matic play imitation correlated negatively 
with reality suggestion and reality agree- 
ment. 

The sexes did not differ in relations
between measures of use of language and
hostility, with one exception: use of reality
hostility did not relate in the same way to
other measures for boys and girls, as is
shown in Tables 8 and 9. In 10 of the 12
comparisons in which the average r was
significant for either or both boys and girls,
sex differences in average r's were not significant.
Use of language and hostility during
dramatic play and reality play had, with
one exception, the same meaning (interrelations
between measures) for these boys
and girls, although boys used dramatic play
language and hostility more frequently than
girls.

The sex difference (.05 level) in relations
for use of reality hostility is puzzling. For
girls, reality hostility increased as dramatic
play suggestion increased, as is shown in
Table 8, while for boys, this relation was
in the opposite direction and not significant,
as is shown in Table 9. For boys, reality
hostility increased as dramatic play hostility
increased, while for girls this relation was
in the opposite direction and not significant.
The more frequently girls made suggestions
during dramatic play, then, the more frequently
they showed hostility during dramatic
play and in play as themselves,
although there was no relation between dramatic
play hostility and reality hostility. As
boys made suggestions more frequently in
dramatic play, they showed dramatic play
hostility more frequently, and less frequently
showed reality hostility in play as
themselves, although boys high in dramatic
play hostility tended to be boys high in


TABLE 8


AVERAGE CORRELATIONS OF THE FOUR GIRLS' AGE GROUPS BETWEEN MEASURES OF USE OF
DRAMATIC PLAY AND REALITY LANGUAGE AND HOSTILITY


Dramatic play Reality 
Measures of 
language and 
hostility Sugges- | Agree- | Imita- Hos- Sugges- | Agree- Hos- 
tion ment tion tility tion ment tility 
Dramatic play: 
Agreement .T3** 
Imitation Е, 14 
Hostility .54* = = 
Reality: 
Suggestion .40* .26 —.34* .20 
Agreement 3 .20 —.45* —.25 .70** 
Hostility .41* = — —.11 .58** .05 
Greeting — .36* — — —.01 .01 .07 —.04 
Question —.17 — — .04 41* .48** .18 


* Significant at .05 level. 
** Significant at .01 level. 

TABLE 9


AVERAGE CORRELATIONS OF THE FOUR BOYS' AGE GROUPS BETWEEN MEASURES OF USE OF
DRAMATIC PLAY AND REALITY LANGUAGE AND HOSTILITY


                  Dramatic play                         Reality
Measures of
language and      Sugges-  Agree-   Imita-   Hos-       Sugges-  Agree-   Hos-     Greet-
hostility         tion     ment     tion     tility     tion     ment     tility   ing
Dramatic play:
  Agreement        .58**
  Imitation        .13      .26
  Hostility        .62**    ...      ...
Reality:
  Suggestion       .23     -.15     -.34*    -.21
  Agreement        .21      .08     -.10     -.22        .55*
  Hostility       -.18      ...      ...      .45**      .38*     .29
  Greeting        -.08      ...      ...      .00        .11     -.07      .11
  Question         .26      ...      ...     -.11        .63**    .45**    .10      .01


* Significant at .05 level.
** Significant at .01 level.


reality hostility. These confusing everyday 
language statements of the sex differences 
in relations are in line with relations to be 
reported for other aspects of social be- 
havior. They suggest a sex difference in the 
meaning of use of reality hostility. 
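
A sex difference between two average r's can be tested in the z metric, as the method paragraph earlier in this section indicates. The sketch below shows the usual test for two independent correlations. The function names and the degrees of freedom in the example are assumptions for illustration; the actual group sizes are not stated in this section, so the printed values are not a reproduction of the reported .05-level result.

```python
import math


def z_difference(r1, df1, r2, df2):
    """Normal deviate for the difference between two independent r's.

    For a single group, df is n - 3. For an average r, df is taken here
    as the sum of (n_i - 3) over the groups that entered the average.
    """
    standard_error = math.sqrt(1.0 / df1 + 1.0 / df2)
    return (math.atanh(r1) - math.atanh(r2)) / standard_error


def two_sided_p(z):
    """Two-sided probability of a normal deviate at least as large as |z|."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Girls' and boys' average r's between dramatic play suggestion and
# reality hostility (.41 and -.18); df of 40 per sex is hypothetical.
z = z_difference(0.41, 40, -0.18, 40)
print(round(z, 2), round(two_sided_p(z), 3))
```

With smaller groups the standard error grows, so the same pair of r's could fail to differ significantly; the conclusion depends on the real group sizes.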
The preceding description of relations
between the various categories of language
use has been limited to use of language during
friendly approach interactions. Similar
relations were obtained for use of language
during association, friendly approach, and
conversation interactions, as is shown in
Table 10. Findings presented in the developmental
differences section suggested that
the relative position of the child in the age
group was about the same for frequency of
use of language during the three types of
friendly interaction. These data suggest that
relations between categories of use of language
are about the same during the three
types of friendly interaction also.


Use of Language and Hostility and Social
Interaction Scores 
Children's use of language and hostility 


to carry out roles in dramatic play related
to the number of children with whom 


and girls between dramatic play suggestion 
and hostility and the friendly interaction 
scores. These findings give strong support 
to the assumption on which this investiga- 
tion is based: that use of language and hos- 
tility is involved in influencing and adjusting 
to age peers during play in preschool groups. 


TABLE 10 


INTERRELATIONS OF THREE USE OF LANGUAGE
CATEGORIES DURING THREE TYPES OF FRIENDLY
INTERACTION FOR GIRLS AND BOYS


Average r between: 


Type of Dramatic | Dramatic 
friendly play play Reality 
interaction | suggestion | suggestion suggestion 
and and and |: 
reality reality reality 
suggestion | agreement | agreement 
Association AT .29 КЫ 
ald. .20 .68* 
Friendly .40* 11 .70* 
approach , .23 21 Зэк 
Conversation .53* .24 57.74% 
.51*: .44* Ud 


Note.—Boys' average r's in italics.
* Significant at .01 level.


TABLE 11


AVERAGE CORRELATIONS BETWEEN MEASURES OF USE OF LANGUAGE AND HOSTILITY
AND FRIENDLY INTERACTION SCORES FOR GIRLS AND BOYS


Friendly interaction scores 
Measures of use of m ТОР 
age and hostilit riendly rii y 
"i dade У Association approach Conversation (A РЕА +C) 
Dramatic play: 
Enc -20** .85** NL .81** 
.34*+% „б5+* .б8*ж „б7жж 
Hostility .56** .60** 50% 614% 
.б7жж ‚73** .7б*ж ‚7б** 
Reality: ij 
Suggestion 217% .A6** .52** .29 7 
—.36%^ 12 -10 =e) 
Agreement 227 —.06 .28 —.14 
j 10 18 .00 —.02 
Hostility .25* S148 An 39 8 
—.27* —.10* —.14^ —.23^ 
Greeting .02 —.03 —.07 —.14 
=>. -08 42. m2. 
Question —.27 .01 —.06 —.24 
] Ed .19 ‚16 14 


Note.—Boys' average r's in italics.
a Sex difference in average r's is significant at .05 level.
* Significant at .05 level.
** Significant at .01 level.


Support for this assumption was not
provided by relations between reality use of
language and hostility and the friendly
interaction scores shown in Table 11. Use
of reality language and hostility did not
relate to extent of social participation with
peers at preschool for boys. For girls, most
categories of use of reality language also
lacked this relation, but their use of reality
suggestion and reality hostility in talking
with other children as themselves increased
as the number of social interactions increased.
The latter average r’s for girls are
significantly smaller (.05 level), however,
than those listed in Table 11 between dramatic
play suggestion and hostility and all
friendly interaction scores except conversation.

These findings, then, are that use of
dramatic play language and hostility increases
as the number of friendly interactions
with other children increases, but
use of reality language and hostility does
not relate to the number of friendly interactions
with other children, except in a few
instances for girls. The findings support the
conception of dramatic play and reality
classifications as opposite or disparate variables.
They meet the second requirement
for such description presented in the introduction
to this section: the two categories of
measures enter into different relations with
measures for other variables.

Additionally, these relations provide further
evidence that the descriptions of “talkative”
or “highly verbal” have confused
meaning when applied to children’s use of
language during play with peers.

The sex difference in relations for reality
hostility supports the idea, presented earlier
in this section, of a difference in the meaning
of use of reality hostility for the two
sexes. For girls, use of reality hostility has
been shown to increase as use of dramatic
play language increases, and as the number
of friendly interactions with other children
increases. These relations were not found
for boys, although boys’ use of drama

and 
of hostile 
increased. 
boys were .95 and .69,
reality hostility and hostile interactions, and
were .71 and .60,
dramatic play hostility and hostile interactions.
These r's are larger
than 15 of the 16 corresponding r's
scores for friendly interactions shown in
Table 11. Such findings indicate a closer
relation for both sexes between the frequency
of use of the two classifications of
hostility and number of hostile interactions
than between the two classifications of hostility
and number of friendly interactions.


of hostility 
hostile interactions, 
with friendly interaction scores. 
tions were not determined between use of 
language measures and hostile interactions. 
Friendly and hostile interactions with
peers were positively related for these children.
The average r’s for girls and boys
were .65 and .51, respectively, between all
friendly interactions (association + friendly
approach + conversation) and hostile interactions.
This direction of relations agrees
with the positive relations between use of
language and hostility within both the dramatic
play and reality classifications.
None of these relations suggest that friendly and
hostile behavior constitute a dichotomy
of “opposites.” Additionally, they
are in line with such popular statements as:
“Only friends quarrel,” or “You always
hurt the one you love.” An age difference
in relations for boys indicates that the related
behaviors probably are not the same,
however, as the love and hate ambivalence
of one of the Freudian mechanisms of identification.
The average r between friendly
and hostile interactions was .74 for boys
younger than 4½ years, and it was .26 for
boys older than 4½ years.


Use of Language and Hostility and Social
Acceptance


The children who talked more frequently
in carrying out the roles of dramatic play
were chosen more often as preferred playmates
by their preschool peers than children
who were observed to use dramatic play
language less frequently. This is shown in
the significant, positive average r's in Table
12 between sociometric test scores and meas- 


TABLE 12 


AVERAGE CORRELATIONS BETWEEN SOCIOMETRIC,
VOCABULARY, AND AGGRESSION TEST SCORES AND
OBSERVATION MEASURES FOR GIRLS AND BOYS


Test scores 
Observation pc се кше sce 
measures 
Socio- | Vocabu- Aggres- 
metric lary sion 
Euer 
Dramatic play 
language: 
Suggestion A0* —.13 —.18 
.46** 14 03 
Agreement 152** | —.10, —.07 
.34* 03 02 
Hostility —.01* —.30* —.12 
E lid A1 —.06 
Reality language: 
Suggestion 99 —.10 .25 
16 —.17 14 
Аргеетепї —.13 00] 26 
04 —.19 30* 
Hostility —.12 —.25 09 
—.12 —.11 .26 
Greeting —.24 5t — 
—.04 —.27* E 
Question —.15 05 — 
—.03 —.01 ca 
Social interaction: — 
Association 48** | —.10 — 
454% 05 — 
Friendly .40* —.22 = 
approach `63** | —10 — 
Conversation .30 24 — 
.64** .20 — 
Association + 49s | —.18 —48 
friendly ap- 59s 02 — 23 
proach 
conversation 
Hostile 13 —.14 05 
aoe .06 .09 


Note.—Boys' average r's in italics.
a Sex difference in average r's is significant at .05 level.
* Significant at .05 level.
** Significant at .01 level.

ures of use of dramatic play suggestion and
agreement. In contrast, the frequency of 
children’s talk as themselves failed to re- 
late to this estimate of social acceptance 
in the preschool group. No average r’s be- 
tween sociometric scores and reality use of 
language and hostility in Table 12 are significant,
and most are negative in direction.

These relations distinguish the social ac- 
ceptability of the dramatic play and reality 
classifications. Frequency of children's use 
of dramatic play language and, for boys, use 
of dramatic play hostility, was involved in
influencing and adjusting to age peers in 
a socially acceptable way during play in 
preschool groups, while frequency of use of 
reality language and hostility lacked this 
meaning or such relations. 

Social acceptance is included in almost 
all lists of desired or desirable character- 
istics of children and adults. By inference, 
then, dramatic play use of language and 
hostility can be described as desirable be- 
havior for children, while reality use of 
language and hostility may be said to lack 
this meaning. 

The measures that related to social acceptance
of these children were the percentage
of 2-minute observation records in which


may not relate to social acceptance in the
same way as dramatic play use of language.
Girls’ use of dramatic play hostility did


had negative, nonsignificant relations with
girls’ social acceptance.
gest that both types of hostility increase at
the same time as increases in friendly behavior,
but that the display of hostility by
girls is not important in determining the
social acceptability of their behavior to their
preschool peers. Their social acceptance
depends in part, at least, upon the frequency
and extent of their friendly behavior with
peers, and is irrespective of their observable
hostility to peers.

Boys apparently had to “shoot their peers 
dead” to be popular. The correlation be- 
tween their use of dramatic play hostility 
and sociometric scores shown in Table 12 
is large, positive, and differs significantly 
from that for girls. The frequency of use 
of dramatic play hostility predicted the
social acceptance of one age group of boys, 
those aged 44-54 years, better than any 
other measure of social behavior; the r was 
.76 for this group of boys. This relation for 
dramatic play hostility for boys is in line 
with those described for both boys and girls 
in other relations for use of the dramatic 
play classification of measures. 

There were no sex differences in relations 
between frequency of reality hostility and 
sociometric scores, as is shown in Table 12. 
The display of hostility for their own 
behalf, then, did not affect the social 
acceptability of either boys or girls to their 
preschool peers.

The correlations
between hostile interaction scores and sociometric
scores in Table 12 are those that
could be predicted
scribed in the three
The positive average r for boys is almost
midway between the corresponding r’s for
dramatic play hostility and reality hostility.
For girls, there was no relation between
social acceptance and the number of hostile
The r's for both
sexes are similar to the .36 and -.09 correlations
between these scores reported by
McCandless and Marshall (1957) for 18
boys and 18 girls, respectively.

acceptable and, hence, “desirable” behavior
for all preschool children. 

The correlations listed in Table 12 between
friendly interaction scores and sociometric
scores are not as large as those found
for Iowa children by Marshall and McCandless
(1957a), although the differences in
r's between samples were not significant.
Differences in age of the children in the two
studies may contribute to the slightly smaller
r's for the Kentucky children. Correlations
for the 2½-3½ year group, an age year not
included in the Iowa study, are somewhat
smaller than those for the older age groups.
The r’s for girls and boys aged 2½-3½ years
were .29 and .38, respectively, between all
friendly interaction scores and sociometric
scores.


Use of Language and Hostility and 
Vocabulary and Aggression Test Scores 


The vocabulary age of the child did not 
relate to his use of language and hostility 
with peers, or to the number of social inter- 
actions with peers, as is shown in Table 12. 
Social use of language with preschool peers, 
then, did not depend on the children's 
knowledge of vocabulary. These children 
had an unusually high mean level of vocabu- 
lary knowledge on the Stanford-Binet test, 
and this finding may be limited to this level 
of vocabulary knowledge. The lack of rela- 
tions reinforces, however, the implication 
drawn from other data that the description 
of "highly verbal" fails to have its expected 
connotations when applied to preschool chil- 
dren's use of language with peers. This 
finding also reinforces the claim that the 
classifications of preschool children's use of 
language and hostility are “new” variables. 

Aggression scores on the doll play type
test of frustration did not relate to the play
observation measures, with one sex group
exception, as is shown in Table 12. The frequency
and intensity of aggression during
this test, then, failed to relate to the hostility
shown by the children to preschool
peers. The implication of these findings is
that the measures of hostility to peers and
the aggression scores on the test were estimates
of two unrelated and dissimilar characteristics
of preschool children.

Sociometric scores did not relate to either
vocabulary scores or test aggression scores.
Average r's of girls and boys between sociometric
and vocabulary scores were .26 and
.01, respectively, and were -.08 and -.03,
respectively, between sociometric and test
aggression scores.


Aggression, Dominance, and Submission
Scores


The dramatic play and reality classifications
of children's use of language and hostility
have been reported to relate differently
to age, extent of social participation, and
the social acceptance of these preschool
children. Either no relations or negative
relations were found between these two
classifications. Measures from both classifications
were combined additively to obtain
aggression, dominance, and submission
scores, as is listed after the names of the
scores in Table 13. Relations for the dramatic
play and reality classifications indicate
that two different variables were combined
in each of these scores. It could be predicted,
then, that the relative proportions
and variability of the specific dramatic play
and reality measures in each combined score
would determine the relations between the
combined scores and other measures.


TABLE 13


AVERAGE CORRELATIONS BETWEEN SOCIOMETRIC
SCORES AND AGGRESSION, DOMINANCE, AND
SUBMISSION SCORES FOR SEX GROUPS


                                                  Sociometric scores
Combined observation measures                     Girls      Boys
Positive aggression (sum of dramatic play          .26        .46**
  suggestion and reality suggestion)
Aggression (sum of positive aggression and         .20        .53**
  dramatic play hostility and reality
  hostility)
Submission (sum of dramatic play imitation         .42*       .25
  and agreement and reality imitation and
  agreement)
Dominance (aggression minus submission)            .07        .21


* Significant at .05 level.
** Significant at .01 level.
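
The score definitions listed in Table 13 amount to simple sums and one difference. The sketch below restates them as code; the class name, field names, and the example frequencies are hypothetical, and the only thing taken from the text is the arithmetic of the four definitions.

```python
from dataclasses import dataclass


@dataclass
class ChildMeasures:
    """Observed frequencies for one child (field names are illustrative)."""
    dp_suggestion: float
    dp_agreement: float
    dp_imitation: float
    dp_hostility: float
    re_suggestion: float
    re_agreement: float
    re_imitation: float
    re_hostility: float


def combined_scores(m):
    """Combine dramatic play (dp) and reality (re) measures additively."""
    positive_aggression = m.dp_suggestion + m.re_suggestion
    aggression = positive_aggression + m.dp_hostility + m.re_hostility
    submission = (m.dp_imitation + m.dp_agreement
                  + m.re_imitation + m.re_agreement)
    dominance = aggression - submission
    return {
        "positive_aggression": positive_aggression,
        "aggression": aggression,
        "submission": submission,
        "dominance": dominance,
    }


# Made-up frequencies for a single child.
child = ChildMeasures(dp_suggestion=4, dp_agreement=3, dp_imitation=1,
                      dp_hostility=2, re_suggestion=5, re_agreement=2,
                      re_imitation=0, re_hostility=1)
print(combined_scores(child))
```

Because each score is an unweighted sum, whichever component varies most across children dominates the ranking on the combined score, which is the point the text goes on to make about the component correlations.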

The correlations between the combined
scores and sociometric scores are presented
in Table 13. These can be compared with
the correlations between sociometric scores
and the use of language and hostility components
of the combined scores presented
in Table 12. When this comparison is made,
the importance of the components for the
relations of each combined score appears
to be as follows:

1. Positive aggression: dramatic play suggestion
appears to have influenced this r for
boys, while for girls, this r is about midway
between those obtained for dramatic play
suggestion and reality suggestion.

2. Aggression: the r’s for dramatic play
hostility (the most variable component)
resemble these r's more closely than those
for the other three components.

3. Submission: the r’s for dramatic play
agreement (the most variable component)
are higher, but resemble these r's more
closely than those for reality agreement.

4. Dominance: these scores do not relate
to social acceptance of either sex group, and
the r’s do not appear to resemble any r's
for the components.

The results of this comparison agree with
the prediction in the preceding paragraph:
correlations for the combined scores depend
on the relative proportions and variability
of the dramatic play and reality components.

In this instance, it was possible to estimate
the importance of dramatic play and
reality measures in relations obtained for
the combined scores. This is not possible
for scores used in past investigations that
are similar in definition to the aggression,
dominance, and submission scores of the
present investigation. The present findings
suggest that such studies should be repeated
with a separation of the dramatic play and
reality components.

The relations for the combined scores and
their components indicate that the combined
scores are less useful than the parts in
furnishing knowledge of children's behavior.
Hence, further analyses were not
made of relations between aggression, dominance,
and submission, and the other variables
of this investigation.


Summary 


Measures in the dramatic play and reality 
classifications of children's use of language 
and hostility with peers did not relate in the 
same way to other measures of children's 
Social behavior. In general, increasing use 
of dramatic play language and hostility 
accompanied an increasing number of 
friendly interactions with other children, 
and more social acceptance in the preschool 
group, while the frequency of use of reality 
language and hostility failed to relate to 
these measures of social behavior. Lan- 
guage and hostility measures within each 
classification were positively related. Either 
no relations or negative relations were found 
between these two classifications of use of 
language and hostility. These results sup- 
port the conception of the two classifications 
as different or opposite variables. They in- 
dicate, also, that not all use of language and 
hostility with peers can be described as 
important in influencing and adjusting to 
peers in preschool groups. 

Sex differences seldom were
found in relations for use of language or
for friendly interactions. Boys apparently
had to display hostility during dramatic
play, as well as friendly behavior, to be
popular. For girls, social acceptance related 
to use of dramatic play language and num- 
ber of friendly interactions, and was irre- 
spective of observed hostility to peers. For 
both sexes, hostile behavior to peers in- 
creased as friendly behavior toward peers 
increased. 

Very few age differences in relations were 
found among the relations between meas- 
ures of social behavior with peers for these 
children aged 2½-6½ years.

The children's vocabulary age on the 
Stanford-Binet Vocabulary test failed to 
relate to measures of use of language and 
hostility with peers. These results support
the idea that the dramatic play and reality
classifications of use of language and hos-
tility are new variables.

The frequency and intensity of aggression 
during a frustration test failed to relate to 
any measure of observed hostility to peers, 
as well as to friendly behavior. These find- 
ings suggest that the test and observation 
measures studied different characteristics 
or variables. 

Relations for the aggression, dominance, 
and submission scores that combined meas- 
ures from both the dramatic play and reality 
classifications were compared with relations 
for measures in the dramatic play and 
reality classifications. This comparison indi-
cated that the combined scores were less
useful than the language and hostility meas- 
ures of the two classifications in furnishing 
knowledge of children's behavior. 

Relations between categories of use of 
language were about the same during asso- 
ciation, friendly approach, and conversation 
interactions. 


Relation of Home Experiences to Children’s 
Social Behavior with Age Peers in 
Play in Preschool Groups 


No one doubts that home experiences are 
major determinants of the behavior of chil- 
dren away from home. Nevertheless, re-
search evidence suggesting how or what
home experiences may affect such behavior
is scanty. This section describes an explora- 
tion of relations between home experience 
variables known to relate to language devel- 
opment and the measures of child behavior 
away from home that have been described 
in the two preceding sections. 


Home Experiences with Dramatic Play 
Topics 


The wide range of experience known to 
foster language development of children was 
limited in this investigation to experiences 
that might have furnished information 
about the dramatic play topics of the child’s 
preschool group. Both parents were asked 
to check any of nine possible home experi- 
ences through which the child had obtained 
information about each of the dramatic play 
topics of his preschool group. The per-
centages of dramatic play topics checked by
parents for the eight frequent types of
home experience are presented in Table 14.
These percentages suggest that the child's 
information about the dramatic play topics 
was obtained principally through talk with 
the father, talk with the mother, talk with 
children at home, personal experience, and 
viewing television. Sources of information 
used much less frequently were talk with 
adults other than parents, books and stories, 
and story or music records. 


TABLE 14


MEAN AND STANDARD DEVIATION OF PERCENTAGE OF DRAMATIC PLAY TOPICS CHECKED BY PARENTS FOR
EACH HOME SOURCE OF INFORMATION OF GIRLS AND BOYS, AND SIGNIFICANCE OF
THE SEX DIFFERENCES


                                          Girls                              Boys                    Significance of
Source of information at home     N*        M         SD          N*        M         SD          sex difference

Talk with father                  46        43        22          55        58        22          .005
Talk with mother                  45        44        25          55        61        20          .001
Talk with other adults            29        23        16          54        35        25          .025
Talk with children                41        46        23          55        53        26          ns
Personal experience               45        40        17          55        50        18          .005
Books and stories                 [illeg.]  [illeg.]  [illeg.]    [illeg.]  [illeg.]  23          .001
Television                        42        43        24          51        59        22          .001
Story or music records            21        19        10          43        16        10          ns
Four or more sources checked      38        29        18          53        46        25          .001


*N = number for whom the source of information was given any checks.

Boys, unquestionably, had more oppor-
tunities at home to learn about dramatic 
play topics than girls. The percentages of 
boys significantly exceeded those of girls for 
seven of the nine types of experience in 
analysis of variance tests, as is shown in 
Table 14. Boys had as many opportunities 
as girls to learn about home and family 
situations, and had more learning experi- 
ences about other topics and situations than 
girls. For example, few parents talked 
about cowboys, guns, destructive activities, 
and construction work with girls. 

Television was the only home source of 
information about dramatic play topics to 
be checked more frequently as the age of 
the child increased. The percentage of 
topics checked for other sources did not 
change with age, according to analysis of 
variance tests. 

The percentages in Table 14 indicate that 
many children lacked information about 
specific topics, or equal opportunity to learn 
about the topics through several sources of 
information, or both possibilities. Two 
analyses were performed to study these pos- 
sibilities. The data from an analysis of the 
number of sources checked for each topic 
are shown in the last line of Table 14. 
Results of a correlation analysis between
percentages checked for six sources of in-
formation are shown in Table 15. Results
of both analyses indicate that children given
information about more dramatic play topics
through one home experience tended to have 
been given information about more topics 
through other home experiences. In the 
opposite terminology appropriate for almost 
as many children, children who had not been 
given information about many dramatic play 
topics through one home experience prob- 
ably did not have an opportunity to learn 
about the topics through other home sources 
of information. 

The average r's in Table 15 suggest, in- 
correctly, that positive relations between 
percentages parents checked for each source 
of information are higher for girls than for 
boys. Significant sex differences in these r's
were limited to the two shown in Table 15.
The somewhat smaller average r’s for boys
include the significant differences in r’s for
4½-5½-year boys from those obtained for
boys of other ages that are shown in Table
16. The percentage of dramatic play topics
checked by parents of 4½-5½-year boys for
one home source of information often failed 
to relate to the percentage of topics these 
parents checked for other sources. 


Relations for Home Experiences with Dra- 
matic Play Topics 


Children’s Use of Language and Hos- 
tility with Peers. An overall generalization 


TABLE 15


AVERAGE CORRELATIONS BETWEEN PERCENTAGES OF DRAMATIC PLAY TOPICS CHECKED FOR
SIX HOME SOURCES OF INFORMATION OF GIRLS AND BOYS


Home source of                   Talk with   Talk with   Talk with   Personal     Books and
information                      father      mother      children    experience   stories

Talk with mother        Girls    .41*
                        Boys     .35*

Talk with children      Girls    .45*        .62**
                        Boys     .13         .34

Personal experience     Girls    .45*        .62**       [illeg.]
                        Boys     [illeg.]    [illeg.]    .36*

Books and stories       Girls    .07a        .62**       .35         [illeg.]
                        Boys     [illeg.]    .50**       .41**       .41**

Television              Girls    .05a        .43*        .41*        .36          .34
                        Boys     [illeg.]    .16         .52*        [illeg.]     [illeg.]


Note. For each home source, girls' average r's are on the first line and boys' average r's on the second.
a Sex difference significant at .05 level.
* Significant at .05 level.
** Significant at .01 level.

TABLE 16


DIFFERENCES BETWEEN 4½-5½ YEAR BOYS AND BOYS OF OTHER [words missing]
PERCENTAGES OF DRAMATIC PLAY IDEAS CHECKED BY PARENTS FOR HOME SOURCES


Correlations between                  Average r for    r for          p of t test
home sources of information           4½-5½ years      other ages     of difference

Talk with father and                  −.04             [illeg.]       .05
  books and stories
Talk with mother and                  −.35             .41*           .01
  television
Talk with mother and                  −.07             [illeg.]       .05
  talk with children
Books and stories and                 −.09             .59*           .05
  television


* Significant at .05 level.
** Significant at .01 level.


about the correlations shown in Table 17
between extent of information the children
gained at home about the dramatic play
topics of the preschool group and their use
of language and hostility during preschool
play is as follows: As the child's home
experiences with the dramatic play topics
of the preschool group increased, the child's
use of dramatic play language and hostility
increased, but use of reality language and
hostility either decreased or was not affected
by these home experiences. These relations
furnish additional evidence of a distinctly
different meaning for children’s use of dra-
matic play language and hostility and chil-
dren's use of reality language and hostility
in talking with age peers.

The home experiences that had the largest
number of significant relations with use of
dramatic play language and hostility are the
first three listed in Table 17: talk with
father, talk with mother, and talk with other
adults. Inspection of the size of correlations
for all home sources of information sug-
gests that children learn more through talk-
ing with loved adults than through such
media as books and television. The standard
of learning in this instance is use of knowl-
edge during children's uncertain, beginning
attempts to play with age peers.


These data, with those of the preceding
section, disclose a possible way for parents
to help their child in social adjustments at
preschool. They suggest that if parents and
adults talk with the preschool child about
more of the topics the child can use in play
with other children, the child talks about
and plays these topics more frequently with
preschool peers, and has a better chance of
social acceptance in the preschool group.

These generalizations do not take account
of the seven significant sex differences in
relations for the dramatic play classification
shown in Table 17. These sex differences
in relations indicate that the parts of the
generalizations concerned with use of dra-
matic play language and hostility applied to
girls, but not to boys. Girls who more fre-
quently used dramatic play language and
hostility had opportunities to learn about
more dramatic play topics from personal
experience and talk with parents and other
adults than girls using dramatic play lan-
guage and hostility less frequently with
peers; these correlations for girls are large
and significant. For boys, these correlations
approach zero in size or are negative in
direction.

The boys used dramatic play language
and hostility more frequently than girls, and
had the highest mean level of social inter-
action that has been reported for boys and
girls in two studies in two different states
(see Table 6). For these boys, then, the
distributions of measures of use of dramatic
play language and hostility and of social
interaction scores may have been concen-
trated at very high levels. There may have
been very few boys in this sample drawn
from the middle and lower range of fre-
quency of use of dramatic play language
and hostility and of social interaction pos-
sible for all preschool children. If so, the
distribution within this sample may have
negated relations demonstrable within
samples from lower ranges or representing
the total population.

Boys had learned about more of the dra-
matic play topics from more home sources
of information than girls. The standard
deviations for girls around their lower mean
percentages of dramatic play topics checked


TABLE 17


                                            Dramatic play                              Reality
Home source of                   Sugges-    Agree-     Hos-       Sugges-    Agree-     Hos-       Greet-     Ques-
information                      tion       ment       tility     tion       ment       tility     ing        tion

Talk with father         Girls   .44*a      [illeg.]   [illeg.]   [illeg.]   [illeg.]   .39*       [illeg.]   −.01
                         Boys    −.01a      .29        .03        [illeg.]   .16        −.22*      .25        −.04
Talk with mother         Girls   [illeg.]   .22        .43a       −.06       −.24a      [illeg.]   −.07       [illeg.]
                         Boys    −.25*      .08        −.25a      [illeg.]   .27*       −.12       −.10       [illeg.]
Talk with other adults   Girls   [illeg.]   [illeg.]   [illeg.]   [illeg.]   .12        --         --         [illeg.]
                         Boys    .00a       .00*       −.20*      .08        [illeg.]   [illeg.]   --         [illeg.]
Talk with children       Girls   .23        .07        .20        .14        .09        .07        −.16       [illeg.]
                         Boys    .17        [illeg.]   −.05       .08        −.08       −.25       −.28       −.03
Personal experience      Girls   .41*       [illeg.]   .18        .19        −.34       .16        −.05       −.15
                         Boys    .21        −.03*      .16        .04        .11        .05        −.09       .03
Books and stories        Girls   .16        .30        .00        −.32       [illeg.]   [illeg.]   −.38*      [illeg.]
                         Boys    .12        .16        −.07       .09        .26        −.08       .30*       −.10
Television               Girls   .24        [illeg.]   −.01       [illeg.]   −.08       −.28       .25        −.24
                         Boys    .16        .22        .18        .05        −.10       .21        [illeg.]   .06
Story or music records   Girls   [illeg.]   .22        .01        .36        −.05       --         --         [illeg.]
                         Boys    .20        −.27       .04        .21        .06        [illeg.]   [illeg.]   --
Four or more sources     Girls   .41*       [illeg.]   .27        .10        −.04       [illeg.]   [illeg.]   [illeg.]
  checked                Boys    .23        .06        [illeg.]   .07        .10        --         --         [illeg.]


Note. For each home source, girls' average r's are on the first line and boys' average r's on the second.
a Sex difference is significant at .05 level.
* Significant at .05 level.
** Significant at .01 level.


for home information sources, shown in
Table 14, are about as large as those for
boys around their higher mean percentages.

These data mean that the distributions of
parent behavior were not the same within
the samples of boys and girls. It is possible
that the parent behavior distribution for
girls was within the range permitting dem-
onstration of these relations, and that the
parent behavior distribution for boys was
not within this range.
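
The restriction-of-range argument in the preceding paragraphs can be illustrated with a small simulation. This is only a sketch under stated assumptions: the variable names, the population correlation of .50, the sample size, and the cutoff of one standard deviation are all hypothetical and are not values from the study.

```python
import math
import random


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


random.seed(0)
TRUE_R = 0.5  # hypothetical population correlation

# Simulated standardized scores: home experience and use of dramatic play language.
home = [random.gauss(0, 1) for _ in range(20000)]
play = [TRUE_R * x + math.sqrt(1 - TRUE_R ** 2) * random.gauss(0, 1) for x in home]

# Correlation over the full range of home experience.
r_full = pearson_r(home, play)

# Correlation within a sample concentrated at the high end of home experience
# (more than one standard deviation above the mean).
high = [(x, y) for x, y in zip(home, play) if x > 1.0]
r_high = pearson_r([x for x, _ in high], [y for _, y in high])

print(round(r_full, 2), round(r_high, 2))
```

In this simulation the correlation inside the high-scoring subsample is clearly smaller than the correlation over the full range, which is the pattern the text proposes as one explanation of the near-zero r's for boys.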

Either explanation of the sex differences
in relations between the dramatic play clas-
sification and some home information
sources is compatible with the sex simi-
larities in corresponding relations for other


nificant r’s obtained for girls in correspond-
ing relations with talk with parents and talk
with other adults.

Sex differences in relations for the dra-
matic play classification may be due, then,
to a negation of relations for boys by an
unusual sample of boys and their parents.
The general statements of these findings
may not need to take account of these sex
differences.

The lack of or negative relations for use 
of reality language and hostility is not a 


Dramatic play 
been described as including most aspects of 
life within the cognizance of young children. 

There is one exception in Table 17 to the
finding of no relations or of negative rela-
tions between children’s use of reality lan-
guage and hostility and home experiences
with dramatic play topics. This is the posi-
tive relation between use of reality hostility
by girls and information gained through
talk with father. This relation is in line
with findings reported earlier that girls’ use
of reality hostility often resembled their use
of dramatic play language and hostility in
relations with other measures.

No significant r's are listed in Table 17
between the dramatic play classification and
three home sources of information: talk
with children, watching television, and
listening to story and music records. How-
ever, positive relations, similar to those for
other home sources, were found within age
groups for the two sources of television
and story and music records. The average
r for 4½-5½-year boys and girls between tele-
vision and use of dramatic play hostility was
a significant .45 that differed at the .05 level
from the −.06 obtained for other age and
sex groups. The negative relation between
use of reality hostility and television is also
in line with the general findings; this aver-
age r for all age and sex groups was a
significant −.24. In the relations between
story and music records and use of dramatic
play suggestion, the average r of .44 ob-
tained for boys older than 4 years differed
at the .05 level from that of −.25 for boys
younger than 4½ years. For all groups of
girls and the two older groups of boys, the
average r between these measures was .43
and significant.

The first significant relation to be re-
ported between frequency of use of reality
greeting and other measures of this investi-
gation occurred for all children for the
information source of talk with other chil-
dren at home and is not shown in Table 17.
This source of information failed to relate
otherwise to children's use of language and
hostility. Children's use of reality language
to greet or welcome other children and
adults at preschool was less frequent when
they talked with children at home about
more dramatic play topics of the preschool.
A significant average r of −.25 was obtained
for all children between these measures.

Average r’s between home experiences
with dramatic play topics and both associa-
tion use of language and hostility and con-
versation use of language and hostility re-
sembled those presented in Table 17 for
friendly approach use of language and hos-
tility, but appeared to be slightly smaller
in size.

Social Interactions with Peers. The rela- 
tions shown in Table 18 between home 
experiences with dramatic play topics and 
number of social interactions with peers 
appear to be a replication on a lesser scale 
of the relations just described for use of 
dramatic play language and hostility with 
peers. General description of these rela- 
tions, then, includes the following state- 
ments discussed more fully in preceding 
paragraphs. 

1. The number of social interactions with 
peers increased as children had home ex- 
perience with more of the dramatic play 
topics of the preschool group. 

2. These relations were more often sig-
nificant for girls than for boys. An unusual
sample of boys and their parents may have
resulted in the sex differences in relations
shown in Table 18.

3. The home sources of information re- 
lating more closely to the number of social 
interactions with peers were those in which 
children talked about dramatic play topics 
with adults important to them. 

Another instance of similarity in relations 
for girls’ use of reality hostility and use of 
the dramatic play classification is shown in 
Table 18. The number of girls’ hostile inter- 


data suggesting that
mation was obtained


of conversation interactions
4½-5½-year boys; the average r for boys of
other ages was .08.

One relation shown in Table 18 may have
a different meaning than the general find-
ings. Boys had more conversation inter-


TABLE 18


AVERAGE CORRELATIONS BETWEEN PERCENTAGES OF DRAMATIC PLAY TOPICS CHECKED BY PARENTS
FOR HOME SOURCES OF INFORMATION AND PEER SOCIAL INTERACTION SCORES FOR GIRLS AND BOYS


                                                    Peer social interaction scores
Home source of                                  Friendly                   All friendly
information                      Association    approach    Conversation   (A + FA + C)   Hostile

Talk with father         Girls   .25            .37*        .31            .34            --
                         Boys    .23            [illeg.]    −.04           .20            [illeg.]
Talk with mother         Girls   .22            .39*        .22            .32            --
                         Boys    −.02           −.04        −.09           −.03           --
Talk with other adults   Girls   .36            [illeg.]    [illeg.]       [illeg.]       [illeg.]
                         Boys    .10            −.28a       [illeg.]       [illeg.]       [illeg.]
Talk with children       Girls   .06            .25         .13            [illeg.]       --
                         Boys    .08            .44         [illeg.]       .17            --
Personal experience      Girls   [illeg.]       [illeg.]    .35            .50**          --
                         Boys    .02            .09         .19            .12            [illeg.]
Books and stories        Girls   .31            .29         .02            .27            [illeg.]
                         Boys    .19            −.03        .03            .12            --
Television               Girls   −.06           .01         .14            .00            --
                         Boys    .14            [illeg.]    .38*           .20            --
Four or more sources     Girls   .36            .32         [illeg.]       [illeg.]       [illeg.]
  checked                Boys    [illeg.]       [illeg.]    [illeg.]       .22            −.16


Note. For each home source, girls' average r's are on the first line and boys' average r's on the second.
a Sex difference in r's is significant at .05 level.
* Significant at .05 level.
** Significant at .01 level.


actions with other children when parents
checked more dramatic play topics for the
information source of talk with other chil-
dren. This source of information failed to
relate to use of dramatic play language and
hostility, as is shown in Table 17. Perhaps
practice in talking with some children about
topics of interest increases the child's ability
to converse with other children for half
a minute or longer.

Sociometric, Vocabulary, and Aggression
Scores. Most average r’s, shown in Table
19, between home experiences with dramatic
play topics and sociometric scores are not
large enough for significance. However,
three positive relations were significant
when sex groups were combined: the aver-
age r’s for all children between sociometric
scores and talk with father (.26), talk with
children (.28), and four or more sources
checked (.25). In direction and relative
size, the r's for sociometric scores resemble
the r’s listed in Tables 17 and 18 for use
of dramatic play language and hostility and
social interaction scores.

These data assist in the interpretation of
already reported findings. Home experi-
ences with dramatic play topics were asso-
ciated with differences in two classifications
of children’s behavior with preschool peers
that accompanied differences in social accept-
ance in the preschool group: frequency of
use of dramatic play language and hostility
and number of friendly interactions with
peers. Home experiences with dramatic play
topics have the positive relations with socio-
metric scores that could be predicted from
the findings of the preceding sentence. The
obtained r’s are not large enough to indi-
cate that these relations are important in
themselves and independent of associated
variables. These relations, then, suggest that
when adults make the effort to talk with the
child about topics the child can use in play
with other children, and when they provide
varied experiences with these topics for the
child, the child is helped in ability to talk
about and act out these topics in play with 
other children, and, indirectly, through this 
ability, to gain social acceptance in the group 
of children. 

Relations between home experiences with 
dramatic play topics and vocabulary test 
scores of these children were not a replica-
tion of the positive relations found in earlier
studies (e.g., Dawe, 1942). Relations for
boys were in the predicted positive direction, 
but all relations for girls were negative, 
although not significant, as is shown in 
Table 19. The sources of home information 
relating significantly to vocabulary age of 
boys were those less closely associated with 
children’s use of dramatic play language and 
hostility with peers: books and stories, tele- 
vision, and four or more sources of infor- 
mation checked. Information obtained about 
the children and their families is not suffi- 


TABLE 19


AVERAGE CORRELATIONS BETWEEN PERCENTAGES OF
DRAMATIC PLAY TOPICS CHECKED BY PARENTS FOR
HOME SOURCES OF INFORMATION AND TEST SCORES
OF GIRLS AND BOYS


                                              Test scores
Home source of                     Socio-      Vocabu-     Aggres-
information                        metric      lary        sion

Talk with father          Girls    .35         −.14        .31
                          Boys     .20         .26         −.01
Talk with mother          Girls    [illeg.]    −.12        −.25
                          Boys     .18         .02         .08
Talk with other adults    Girls    .06         [illeg.]    .02
                          Boys     −.01        .20         .00
Talk with children        Girls    .23         −.25a       −.12
                          Boys     .33*        .26a        .16
Personal experience       Girls    .35         −.15        --
                          Boys     .02         .27         --
Books and stories         Girls    [illeg.]    −.04        --
                          Boys     .25         .38*        --
Television                Girls    [illeg.]    −.06        --
                          Boys     .29         .42**       --
Story and music records   Girls    .09         −.30        --
                          Boys     .09         [illeg.]    --
Four or more sources      Girls    .26         −.31a       --
  checked                 Boys     .22         .52**a      --


Note. For each home source, girls' average r's are on the first line and boys' average r's on the second.
a Sex difference in relations is significant at .05 level.
* Significant at .05 level.
** Significant at .01 level.


cient to explain these sex differences in rela- 
tions. These differences are compatible with 
the findings of no relation between vocabu- 
lary age and use of language and hostility 
with peers, and of sex differences in rela- 
tions between use of language and hostility 
and home experience with dramatic play 
topics. 

Test aggression scores did not relate to 
the percentage of dramatic play topics 
checked for the first four home sources of 
information listed in Table 19. Relations 
were not determined for the other informa- 
tion sources. 


Time for Stories, Records, and Television 


The time parents reported that children 
listened to stories or to records did not 
change with the age of the child or differ 
with sex, according to analysis of variance 
tests. All children listened to stories for a 
mean of 23.14 minutes daily, and the stand- 
ard deviation was 12.85 minutes. For the 
92 children who listened to records at home, 
the mean time per day was 20.20 minutes,
and the standard deviation was 19.13 
minutes. 

Daily time children spent watching tele- 
vision increased about a half hour with 
each year of age, as is shown in Figure 9.
The four children who did not watch tele-
vision were split equally between the two
younger age groups. Time watching tele-
vision did not differ with sex in analysis
of variance tests. The standard deviations
for each year group ranged from 30.07 to
34.44 minutes. Parents reported that the
programs watched most frequently were
Captain Kangaroo or Romper Room during
the time between breakfast and leaving for
preschool; children's programs scheduled
from 4:00 to 6:00 PM; family, western,
adventure, and musical programs from 6:30
to 8:00 PM; and 4 hours of programs on
Saturday morning, such as Superman, 
Mighty Mouse, Fury, etc. 


Relations for Time of Stories, Records, and 
Television 


The time children spent listening to
stories or records and watching television
at home related only sporadically to their
preschool and test behavior. The amount
of time spent at each activity correlated
significantly, but not closely (average r's


with the percentage of


Fig. 9. Mean daily minutes children in each age
group watched television.


The significant relations for time spent
at these activities were the following:

1. Time for stories had an average r of
.35 with the frequency of use of reality
hostility by boys; for girls, this r was .02.

2. Time for records correlated positively
with girls' use of dramatic play suggestion
(.39) and dramatic play hostility [value illegible].
The r for dramatic play hostility differed
at the .05 level from that of −.01 for boys.

3. Time for records had an average r of
.25 with number of conversation inter-
actions in all age and sex groups.

4. Time for television had an average r
of .60 with girls’ use of reality greeting
that differed at the .05 level from the .13
obtained for boys.

5. Time for television had an average
r of .28 with vocabulary age in all age and
sex groups.
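
Several items in the list above report that two r's "differed at the .05 level." One common way to compare two correlations from independent groups is Fisher's r-to-z transformation, sketched below. This is not the monograph's procedure (the text describes tests on average r's across age groups), and the group sizes of 40 and 50 are hypothetical; only the r's of .60 and .13 come from item 4.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Two-sided normal test of the difference between two independent correlations.

    Returns the z statistic and its two-sided p value.
    """
    # Fisher's transformation makes the sampling distribution roughly normal.
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    # Standard error of the difference between the two transformed values.
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z_stat = (z1 - z2) / se
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(z_stat) / math.sqrt(2.0))
    return z_stat, p_value


# r's from item 4 (girls .60, boys .13); the group sizes are assumed for illustration.
z_stat, p_value = fisher_z_difference(0.60, 40, 0.13, 50)
print(round(z_stat, 2), round(p_value, 3))
```

With these assumed group sizes the difference would fall below the .05 level; with much smaller groups the same pair of r's would not.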


Time Spent Talking with Family Members 
and Maids 


The children of this study were “privi-
leged” in the sense of much time devoted


change with age or differ with sex in
analysis of variance tests.


get dressed, but it necessarily
Five hours a

TABLE 20


HOURS PER WEEK CHILDREN SPENT IN TALK
WITH PEOPLE AT HOME


                               N          Mean
Family member                  children   hours     SD        Range

Father                         101        18.69     10.00     2-44
Mother                         100        37.30     14.30     12-75
Siblings:
  Of 4½-6½ year groups         56         27.08     7.24      3.5-56
  Of 2½-4½ year groups         38         3.83      2.22      [illeg.]
Maid or baby sitter            [illeg.]   8.42      7.82      [illeg.]

The fathers actively participated in the
home guidance of the children; their mean 
time of talking with the child was slightly 
more than half the mean time so spent by 
mothers. It was distributed as about 2 hours 
each week day, and about 7 to 10 hours on 
weekends. They expressed opinions during 
the interview that sharing activities with 
children was important for the child's devel- 
opment. Fathers devoting fewer than 5 
hours a week to talk with their children 
were in specialized professions, such as 
psychiatry, brain surgery, and obstetrics. 
For the group as a whole, however, years 
of father's education did not relate to the 
time spent talking with the child. 

Maids and baby sitters had an average of
slightly more than an hour a day of talk
with the child in the 81% of the homes
using their services. This talking time was
described usually as interactions necessarily
occurring in the small space of houses, or
as due to the child’s interest in the maid's
activities. 

Children older than 4 years talked about
seven times as long with their brothers and
sisters as children younger than 4½ years.
This time difference, shown in Table 20,
was not analyzed for age of siblings. It
may indicate that siblings of younger chil-
dren were too young to talk, as well as a
lack of ability in these subjects to share
activities with older siblings. 

Some interrelations between the time 
spent talking with individuals at home sup- 
port the idea that a short time talking with 
one family member was balanced by a
longer time talking with other family mem-
bers, as is shown in Table 21. When the
children talked more with siblings, they
talked less with mothers and maids, and 
vice versa. Other relations do not support 
this idea. Time spent in talk with father 
did not relate to time spent in talk with 
other family members, and time spent talk- 
ing with mothers correlated positively with 
time spent talking with maids. 


Relations for Time Spent Talking with Indi-
viduals at Home


Measures of Child Behavior. Increases in
the time children spent talking with family

TABLE 21


INTERRELATIONS BETWEEN THE HOURS THAT
FAMILY MEMBERS AND MAIDS TALKED WITH
THE CHILDREN AS SHOWN IN AVERAGE
CORRELATIONS FOR ALL AGE
AND SEX GROUPS


Time of family       Father's    Mother's    Siblings'
member               time        time        time

Mother's time        .15
Siblings' time       −.05        −.30*
Maid's time          −.10        .31*        −.22


* Significant at .05 level.


members tended to accompany decreases in
frequency of talk with peers and in number
of social interactions with peers at pre-
school, according to the data presented in
Table 22. This interpretation is based on
consistency of the negative direction of
relations; few r's for these relations are
large enough for significance.
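
How large an r must be to reach significance depends on the number of cases. The sketch below gives a rough threshold using the Fisher z approximation; it is an illustration only, the sample sizes are hypothetical, and it does not reproduce the monograph's tests, which were applied to average r's across age groups.

```python
import math


def approx_critical_r(n, z_crit=1.96):
    """Approximate smallest |r| reaching two-sided .05 significance for n cases.

    Uses the Fisher z approximation, in which atanh(r) has a standard
    error of 1 / sqrt(n - 3).
    """
    return math.tanh(z_crit / math.sqrt(n - 3))


# Hypothetical sample sizes, chosen only to show how the threshold shrinks.
for n in (20, 50, 100):
    print(n, round(approx_critical_r(n), 2))
```

A consistent run of small negative r's can therefore point in one direction, as the text argues, even when few of them individually pass such a threshold.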

In contrast, increases in the time children
spent talking with maids or baby sitters
accompanied increases in children’s use of
reality language, or of frequency of talk as
themselves with peers. These r's are large
enough for significance, as is shown in
Table 22. Positive relations for use of
reality language with preschool peers have
been rare in the data reported so far.
Hence, these relations are not easily inter-
preted.

More talk with the maid at home appar-
ently fostered the child's frequent use of
a behavior with peers that is not essential
in getting along with peers. Concurrent
with the increase in children's use of reality
language, behavior important to the child's
acceptance by preschool peers decreased as
children spent more time talking with the
maid at home. The positive r’s for use of
reality language differed at the .01 level
from the significant average r's of all age
and sex groups between maid’s time talking
and use of dramatic play suggestion (−.27)
and dramatic play agreement (−.26). Time
spent talking with the maid cannot be
described as contributing to the
adjustment at preschool.


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TABLE 22 


Observation measure Father's time Mother's time Siblings’ time Maid's time 
Dramatic play language: 
Suggestion E —.14 —.04* —.40*« 
25504. E05) 028 TW) 
Agreement ES EI —.13 —.06 —.44*» 
2:29 Вр we ee "dA 
Hostility cA .16 20 .04 
EI —.10 —.03 —.07 
Reality language: 
Suggestion —.18 —.04 .18 .19 
= 32* my f .33* 
Agreement —.1$ = -00 ‚45+** 
11 12 —.20 «46% 
Hostility +13 34% —.28 09 
16 26 =.26 —.08 
Greeting — .34* —.06 .20 31 
—.29 06 =15 18 
Question —.02 213 .19 35 
—.14 20 14 30 
Social interaction: 
Association —.17 —.05 .03^ ie 
03 —.20 —.08* 7168 
Friendly approach -—.13 —.10 .09 —.14 
—.09 .01 a —.07 
Conversation —.22 .15 .06 .05 
—.20 .01 —.08 —.03 
All friendly (A + FA +C) —.12 10 125 512 
—.05 -10 —. 14s 19 
Hostile ыы] .15 —.11 —.18 
12 27 —.32* 07 


Note.—Boys' average r's in italics.
a Differences in relations are significant at .05 level.
* Significant at .05 level.
** Significant at .01 level.


€ average amount of 


reward value for the child
in talking with adults that inhibits the
child's development of reward value for
For example, excessive 


In these relations,
negative for children younger than 4½ years,


positive for children older than 4½
years. More time spent talking with siblings
accompanied less frequent dramatic play
suggestion and fewer friendly interactions
at preschool for children younger than 4½
years, but accompanied more frequent dra-
matic play suggestions and more friendly
interactions at preschool for children older
than 4½ years.

No average r’s were significant between 
time spent talking with persons at home and 
children’s scores on the sociometric and 
vocabulary tests. Corresponding correla- 
tions were not computed for scores on the 
test of aggression. 

Measures of Home Experiences. There 
was little evidence of relations between the 
time family members and maids spent talk- 
ing with children and home experiences 
with the dramatic play topics of the pre- 
school group. The only significant r was a
positive .37 obtained for boys between time
spent talking with father and the percentage 
of dramatic play topics checked for the 
information source of talk with father. 
Most correlations between time spent talk- 
ing with family members and percentages 
checked for home information sources were 
close to zero in size and were inconsistent 
in direction. Correlations for time spent 
talking with the maid were small in size also, 
but had a consistent negative direction. 

These data indicate that the time spent 
talking with family members and maids is 
independent of the child’s home experiences 
with the dramatic play topics of the pre- 
school group. They support the findings 
that time spent talking with persons at 
home relates negatively, if at all, to chil- 
dren’s use of dramatic play language and 
hostility with peers, and to the number of 
social interactions with peers at preschool. 

The time children spent listening to 
stories and records and watching television 
was not associated with the time spent talk- 
ing with family members or maids. 


Education of Parents 


Years of education of the mother and 
father did not vary with the age and sex of 
the child, or with the preschool group, in
analysis of variance tests. Mean years of
education of fathers were 16.74, or one
year beyond a bachelor's degree; the stand-
ard deviation was 2.77 years, and the range
was from 12 to 23 years. Mean years of
education of mothers were 15.15, or one
year less than a bachelor's degree; the
standard deviation was 1.56 years, and the 
range was from 12 to 18 years. The level 
of education of both parents can be de- 
scribed as high. Variation within both 
groups was limited to the years after gradu- 
ation from high school. These data indicate 
that the attempt to control socioeconomic 
status in this investigation was successful. 


Relations between Education of Parents and 
Measures of Child Behavior and Home Ex- 
periences 


Few differences in child behavior meas- 
ures were associated with differences in 
years of education of parents. Failure to 
find such relations is expected when socio- 
economic status is controlled within one 
level, as in this investigation. The signifi- 
cant exceptions to the general trend were 
the following relations:

1. Sociometric scores had an average r of
.25 with father's education for all age and
sex groups, but did not relate to mothers’ 
education. 

2. The average r's were both —.27 be-
tween years of education of the mother and
[...] of reality lan-
guage to greet preschool peers and adults.

3. Frequency of dramatic play hostility 
of girls increased as mothers’ education 
increased (.26), while the opposite relation 
was found for boys (—.28). 
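The opposite signs reported for girls and boys in item 3 invite a test of whether two correlations from separate groups differ. A minimal sketch of one common approach, the Fisher z test for two independent correlations, follows. The group sizes of 46 and 54 are placeholders for illustration, the function name is hypothetical, and the text does not say that this particular test was used; the reported values are also averages over age groups, so the calculation is only indicative.

```python
import math
from statistics import NormalDist


def compare_independent_rs(r1: float, n1: int, r2: float, n2: int):
    """Return (z, two-tailed p) for the difference between two independent r's."""
    if min(n1, n2) <= 3:
        raise ValueError("each group needs more than 3 pairs")
    # Standard error of the difference between two Fisher-transformed r's.
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (math.atanh(r1) - math.atanh(r2)) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# r's taken from item 3 (.26 for girls, -.28 for boys);
# the group sizes are assumed placeholders, not reported with these r's.
z_stat, p_value = compare_independent_rs(0.26, 46, -0.28, 54)
```

With groups of roughly this size, a reversal of sign of this magnitude would be unlikely to arise from sampling alone, which is consistent with its being listed among the significant exceptions.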

Education of mothers, but not of fathers, 
related positively to the percentage of dra- 
matic play topics parents checked for three 
home sources of information for these chil- 
dren: talk with mother, talk with children, 
and books and stories. These average r's
for all age and sex groups ranged from
.35 to .39. Mothers' control of these sources
of information can be described as influ- 
enced by the extent of their education. 

The time spent listening to stories and
records did not relate to education of par-
ents. The time girls spent watching tele-
vision at home related negatively (—.49)
to education of both parents.

Home experiences with dramatic play
topics related positively to the vocabulary
age of boys, but these relations were nega-
tive for girls.

There was some evidence, although not
demonstration, that as the time children


Time spent at home in talk with the maid
had positive relations with children's use
of reality language and negative relations


Preschool 


as an increase T ` f 3 ane 
Ban y - ew differences in child behavior were 
er of their Social. interactions A ; 


Correlations Were largest. for 




tion Sioned ў ; an of 19 hours in talk 
School'£roup, ' it ildren each Week. Stories were 
a mean of 23 
d most children 


test, according to Schaefer
and Bell (1955, pp. 3-4;
1958, p. 346), includes items in its "patho-
genic" scales that "state attitudes contrary
to the usually approved child-rearing opin-
ions." It was conjectured that parents'
scores on these scales would relate to chil-
dren's hostile interactions, and might relate
to other aspects of children's social behavior.


Parents' Scores on the PARI


Parents' checks indicating mild or strong
agreement or disagreement with items in-
cluded in the PARI scales were given point
values and summed to furnish scores for
the single scales (listed in Table 24). The
five PARI composite scales, listed in Table
23, are the five general attitudes derived by
Schaefer and Bell from a study of the fac-
torial structure of the PARI. Parents'
scores on the PARI composite scales are
the sum of their points on the five to eight
related scales included in each composite,
and listed in the method section.
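The scoring procedure just described can be sketched as follows. The point values (1 to 4), the response labels, and the function names are hypothetical and chosen only for illustration; this section does not report the values actually assigned. The membership list uses the four single scales the text names as belonging to Demand for Striving (Scales 2, 3, 4, and 17); the fifth scale is not named here and is left out.

```python
# Hypothetical point values for the four response options (an assumption).
POINTS = {
    "strong_disagree": 1,
    "mild_disagree": 2,
    "mild_agree": 3,
    "strong_agree": 4,
}


def scale_score(responses):
    """Sum the point values of one parent's checks on a single scale."""
    return sum(POINTS[response] for response in responses)


def composite_score(single_scores, member_scales):
    """Sum a parent's single-scale scores over the scales in one composite."""
    return sum(single_scores[number] for number in member_scales)


# One parent's invented checks on four single scales (five items each).
checks = {
    2: ["mild_disagree"] * 5,
    3: ["strong_disagree"] * 5,
    4: ["mild_agree"] * 5,
    17: ["mild_disagree"] * 5,
}
single_scores = {number: scale_score(items) for number, items in checks.items()}
partial_demand_for_striving = composite_score(single_scores, [2, 3, 4, 17])
```

Because a composite is a plain sum, its possible range and midpoint depend on how many single scales it contains, so midpoints differ from one composite to another.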

The parents of this investigation dis-
agreed with most pathogenic scales of the
PARI. Mean scores of parents were on the
disagreement side (below) of the midpoint
of all composite scales except that listed
second, Unhappiness at Home, as is shown
in Table 23. Mean scores were in the strong
disagreement range for the Suppression and
Distance scale, and in the mild disagreement
range for the remaining three scales. Most
of these parents, then, agreed with "usually
approved child-rearing opinions."


Scores of fathers did not differ from
scores of mothers on any composite scale 
except Unhappiness at Home, according to 
analysis of variance tests. Mothers' agree- 
ment with the statements included under 
Unhappiness at Home was stronger (.001 
level) than that of fathers. Agreement of 
both parents with this scale decreased (.05 
level) as the age of the child increased. 
These results are in line with the common
sense idea that the restrictions of child
rearing are more irksome to parents, par-
ticularly mothers, when the children are
younger. This was the only composite scale 
that related to the age of the child. 

Parents of boys disagreed more strongly 
with the Suppression and Distance scale 
(.05 level) and the Overpossessiveness scale 
(.01 level) than parents of girls. These 
findings agree with generally held opinions 
that parents allow boys to have a greater 
freedom of expression than girls, and en- 
courage independence more in boys than in 
girls. Parents’ scores on the other three 
composite scales listed in Table 23 did not 
differ with the sex of the child. 

Agreement between fathers and mothers 
in the same family was not particularly high 
on either single or composite PARI scales,
as is shown in Table 24. The size of many 
correlations suggests that the mothers and 
fathers who agreed about these attitudes 
didn’t happen to be married to each other. 
Inspection of all correlations suggests that 
the degree of agreement depended in part 


TABLE 23

MEAN SCORES ON PARI COMPOSITE SCALES FOR 41 FATHERS AND 46 MOTHERS OF GIRLS AND
FOR 52 FATHERS AND 54 MOTHERS OF BOYS, AND THE STANDARD DEVIATIONS FOR ALL PARENTS

                                  Girls                 Boys
PARI composite scale         Fathers   Mothers     Fathers   Mothers    SD of all
                                                                        parents

Suppression and distance       ...      40.26       40.13     37.92       8.04
Unhappiness at home            ...      65.48       61.21     65.61       8.66
Demand for striving            ...      47.80       50.46     48.02      12.25
Overpossessiveness             ...      47.91       44.65     45.72       6.71
Harsh punitive control         ...      16.09       76.90     15.42      16.37


on the attitude tested. More agreement
between mothers and fathers was found on
scales emphasizing strong or punitive con-
[...] of children's recognition of parents [...]
in children, and with the role of either
parent in promoting or discouraging family
harmony, functions, or activities.


TABLE 24

PRODUCT-MOMENT CORRELATIONS BETWEEN PARI SCORES OF FATHERS AND PARI SCORES OF
MOTHERS FOR GIRLS AND BOYS

                                          41 parents    50 parents
PARI scale                                of girls      of boys

Composite scales:
  Suppression and distance                   .03          .40**
  Unhappiness at home                        .27          .31*
  Demand for striving                        ...          ...
  Overpossessiveness                         ...          ...
  Harsh punitive control                     ...          ...
Single scales:
   1. Encouraging verbalization              ...          .34*
   2. Breaking the will                      ...          ...
   3. Strictness                             ...          ...
   4. Deification of parents                 ...          ...
   5. Suppression of aggression              .21          .29*
   6. Equalitarianism                        .29          .31*
   7. Approval of activity                   ...          ...
   8. Avoidance of communication             ...          ...
   9. Suppression of sex                    —.04          ...
  10. Comradeship and sharing                .33          .19
  11. Deceit of child                        ...          ...
  12. Harsh punishment                       ...          ...
  13. Expressing love and affection          ...          ...
  14. Autonomy                               .04          .15
  15. Intrusiveness                          .19          .29*
  16. Marital conflict                       ...          ...
  17. Excluding outside influences           ...          ...
  18. Fostering dependency                   .46**        .26
  19. Irritability                           .17          ...
  20. Seclusion of mother                    ...          ...
  21. Rejection of homemaking role           ...          ...
  22. Considerateness of spouse             —.08          .15
  23. Ascendance of parent                   ...          ...
  24. (Father) Disapproval of ascendance
      of mother; (Mother) Dependence of
      mother                                 ...          ...

* Significant at .05 level.
** Significant at .01 level.


TABLE 25

AVERAGE CORRELATIONS BETWEEN PARENTS' SCORES ON PARI COMPOSITE SCALES AND
OBSERVATION MEASURES OF CHILDREN'S HOSTILITY TO PEERS

                            Use of dramatic    Use of reality hostility     Number of hostile
                            play hostility                                  interactions
                                                          Boys
PARI composite scale        Girls    Boys      Girls   All but   4½-5½      Girls    Boys
                                                       4½-5½     years

Scores of fathers:
  Suppression and distance  —.27     .06       —.12     .29      —.25       —.16     .04
  Unhappiness at home        .08     .18        .11     .64**     .18        .10     .04
  Demand for striving        .19     .01        .34*    .69**     .13        .20     .40**
  Overpossessiveness        —.50**  —.34*       .14     .25      —.39       —.09    —.50**
  Harsh punitive control     .04     .34*       ...     .65**     .04        .22     .33*
Scores of mothers:
  Suppression and distance  —.14     .19        .11     .08      —.28       —.02     .10
  Unhappiness at home        .08     .06        .19     .19       .21        .17     .08
  Demand for striving        .16    —.03        .25     ...      —.22        .41**   .18
  Overpossessiveness        —.03    —.16       —.01     .04      —.21        .14    —.03
  Harsh punitive control     .20     .04        ...     .35*      .09        .45**   .26

* Significant at .05 level.
** Significant at .01 level.


Children's Display of Hostility to Peers


[...] by the data in Table 25. Four of the PARI
composite scores of fathers related to some
observed manifestation of hostility of their
[...]cant relations with the measures of child
hostility.

Punitive Control Scales. Boys and girls
were more hostile at preschool when both
mothers and fathers agreed more (or dis-
agreed less) with items included in the
Demand for Striving and Harsh Punitive
Control scales. This relation was found for
all three classifications of children's hos-
tility, and can be described as the most
general of the relations shown in Table 25.
Additional generality is suggested by the
similarity of the scales included in the two
composite scales. Four of the five scales
combined in the Demand for Striving com-
posite scale were included among the eight
scales combined in the Harsh Punitive Con-
trol composite scale (Scales 2, 3, 4, and 17
of the single scales listed in Table 24).
Relations for both scales, then, indicate that
the children who were hostile more fre-
quently to their preschool peers had parents
who favored punitive control in making
demands of their children while children
who showed less hostility to preschool peers
had parents who disagreed with ideas of
[...]ably, favored other methods of guidance.
These results support psychological hy-
potheses that punishment leads to aggres-
sion. They also support hypotheses that
when an aggression drive is a consequence
of parents' punitive control, it will be ex-
pressed or displaced to persons other than
(or, as well as) parents. These findings are
unusual in demonstrating a displacement


to age peers in preschool play. This specific 
displacement was not found by Sears et al. 
(1953) in the study most comparable to 
the present investigation. In that study, 
ratings of the punitiveness of 40 mothers 
did not relate significantly to ratings of their 
children's aggression at preschool. Sears 
et al. hypothesized that their child subjects 
were too anxious about being punished for 
aggression to peers to be able to make this 
displacement of the aggression drive. By 
that hypothesis, the children of the present 
investigation were not afraid to be hostile 
to peers. It is generally believed that 
nursery school procedures have shifted in 
the direction of less teacher interference in 
children's conflicts during the years since
the Sears et al. data were collected (1948-
49). This may have reduced fear of punish-
ment for hostility to peers and may explain
part of the difference in results. Probably 
more important in explanation of the con- 
tradictory results, however, are (a) the 
differences in measures of both parent and 
child behavior, particularly the separation 


of child hostility into the dramatic play and 
reality classification of the present investiga-
tion; (b) the wider child age range of the
present investigation; and (c) the inclu-
sion of behavior of fathers, as well as of
mothers, in the present investigation. Rea- 
sons for considering these aspects to be 
more important are the findings reported 
in the next few paragraphs. 

Children's use of hostility in their own 
behalf had more and closer relations with 
the punitive control scores of parents than 
their use of hostility to carry out the roles 
of dramatic play or the number of their
hostile interactions with peers. Six of the 
eight correlations between the two punitive 
control scales and use of reality hostility 
were significant, while only one correlation 
for dramatic play hostility was large enough 
for significance, as is shown in Table 25. 
Correlations between punitive control scores 
of parents and the number of children’s 
hostile interactions with peers are those 
that could be expected from the combination 
of relations for use of dramatic play and
reality hostility, with one sex of child and
sex of parent exception.


children's use of hostility to establish or 
defend their "rights" with age peers in- 
creases. This relation is accompanied by 
an increase in the number of hostile inter- 
actions with other children, and, for boys, 
by an increase in use of dramatic play 


not found for similar relations with dra- 
matic play hostility. The age differences in 


r’s for hostile interaction scores were large 
enough to be significant for only one rela- 
tion, as would be anticipated for the com- 


Year boys, while for boys of other ages the 


child subjects had an initial age range from 
3-4 to 5-5 years. 

Sex of child and sex of parent differences
in degree of relation are apparent in in-


to peers. If fathers had not been used as
subjects of this investigation, relatively little


hypothesis that punishment leads to aggres-
sion.

In relations between the number of hostile
interactions and the punitive control scales,


as the child. This tendency was not found
in relations for dramatic play and reality
hostility. The sex of parent differences in


reality hostility in preschool play when
parents agreed more with ideas favoring


Demand for Striving composite (2, 3, 4,
and 17) were significant and similar to those
shown for this composite scale in Table 25.
Relations for five of the eight scales in the
Harsh Punitive Control composite (2, 3, 4,
12, 17, 19, 20, and 23) have this resem-
blance, also.



The differences in relations for 4½-5½
year boys and other boys are less apparent
in Table 26 for the single scales entering
into the two punitive control composites,
than for the composite scales in Table 25.
An exception to the resemblance occurred
for mothers' agreement with items on the
Harsh Punishment scale (No. 12). In this
instance, the relation was positive and sig-
nificant for 4½-5½ year boys, but lacked
significance for boys of other ages and for
girls.

Correlations for scores of fathers on
other scales in Table 26 suggest a pattern
of relations similar to those proposed by
Mussen and Distler (1960) as possible ante-
cedents of identification in boys: the fathers'
influence on their sons includes both reward
and punishment. Greater agreement by these
fathers with the items of Approval of
activity, Comradeship and sharing, and
Autonomy scales accompanied a higher level
of reality hostility in their sons. These posi-
tive relations for use of reality hostility

TABLE 26 
PRODUCT-MOMENT CORRELATIONS BETWEEN PARENTS' SCORES ON SINGLE PARI SCALES AND
THE FREQUENCY OF USE OF REALITY HOSTILITY BY GIRLS, BY BOYS AGED 4½-5½ YEARS,
AND BY BOYS OF OTHER AGES

                     Fathers' scores and             Mothers' scores and
                     reality hostility               reality hostility
PARI scale                       Boys                            Boys
                     Girls   All but   4½-5½         Girls   All but   4½-5½
                             4½-5½     years                 4½-5½     years
1. Encourage verbalization —.09 .28 .18 .07 —.10 212 
2. Breaking the will .32* ETC .02 23 ‘a | — 12 
3. Strictness .20. |, -34* ni .03 14 —.23 
4, Deification of parent «soe 18 10 .32* .31 .28 
5. Suppression of aggression .16 —.16 —.47 —.18 —.08 14 
6. Equalitarianism | o8 .09 .42 —.07 06 AT 
7. Approval of activity EST .61*+ .38 .08 AB | —.14 
8. Avoidance of communication .01 ‚52** | —.04 УХ ol 07 
9, Suppression of sex —.05 .09 .03 —.08 —.13 —.26 
‚ 10. Comradeship and sharing .20 opre] .33 AT .25 1 
11. Deceit of child —.02 .63** .38 .00 .28 —.34 
12. Harsh punishment .34* .36* #42 11 22 :51* 
13. Expressing love ‘08 | —.02 .40 11 21 35 
= 14. Autonomy 14 ‚51** 43 | —-02 .30 AL 
. Intrusiveness 139% .22 —.23 „15 .26 —.17 
. Marital conflict .08 2.2 .14 .00 .20 ‚14 
. Excluding outside influences .08 .26 .01 УЛ .22 .03 
. Fostering dependency .20 —.20 —.15 .02 11 23 
. Irritability .03 —.07 .08 .32* .03 "Su 
. Seclusion of mother .24 .04 -.23 —.14 14 .06 
. Rejection of homemaking role —.08 .04 .02 ‚04 —.10 .29 
. Considerateness of spouse 2255) .21 = .21 03 .18 AT 
. Ascendance of parent 12 11 -10 WT 27 -.82 
. (Father) Disapproval of .AT** .06 Te .06 .19 42 
ascendance of mother 
Dependence of mother 


* Significant at .05 level.
** Significant at .01 level.




occurred at the same time as the positive
relations between use of reality hostility and 
punitive control scores of fathers. 

Overpossessiveness. Relations between 
parents’ scores on the Overpossessiveness 
composite scale and children’s observed 
hostility to peers were not the same as those 
between parents’ scores on punitive control 
scales and children's hostility. The relations 
differed in direction and in the type of hos- 
tility most affected by parents' scores, as is 
shown in Table 25. When fathers, but not 
mothers, agreed more with the ideas of the 
Overpossessiveness composite scale, chil- 
dren's use of hostility to carry out the roles 
of dramatic play decreased, the number of 
hostile interactions of boys decreased, and 
children's use of reality hostility did not 
change, except in a nonsignificant trend 
toward an increase. Parents’ attitudes about 
Overpossessiveness related negatively to 
children's use of dramatic play hostility, 
then, while parents' attitudes about punitive 
control related positively to children's use 
of reality hostility. These are major differ- 
ences in meaning for the dramatic play and 
reality classifications.


The sex of parent differences in relations
for the Overpossessiveness scores are probably
connected with the time these parents
spent talking with their children, particu-
larly since these fathers did not differ from
mothers in agreement with this scale. The
time spent talking with the child by the
average mother of this investigation, 37
hours a week, would be considered by many
to be excessive and indicative of overpos-
sessiveness. Speculatively, these children
may not have been able to detect or be
affected by the differences in overpossessive-
ness attitudes of mothers. On the other
hand, the fathers spent enough time talking
with the children to impress their attitudes
upon the children, but it was only about
half the time so spent by mothers.

These speculations are about between-
group tendencies, and the following rela-
tions within the groups may not be a test
of the speculations. Correlations between
parents' scores on the PARI composite
scales and the time family members and
maids spent talking with the children were
not in line with these speculations. Only


TABLE 27

AVERAGE CORRELATIONS BETWEEN PARENTS' SCORES ON PARI COMPOSITE SCALES AND THE
FREQUENCY OF CHILDREN'S USE OF DRAMATIC PLAY AND REALITY LANGUAGE WITH
PRESCHOOL PEERS

                 Use of dramatic play suggestion      Use of reality language
Scale                             p of sex            Suggestion       Agreement
                 Girls    Boys    difference
                                  in r's              Girls   Boys     Girls   Boys

Scores of fathers: 

Suppression and distance | — 49s .09 -05 .01 .03 .24 .10 

Unhappiness at home —.11 EU .01 .01 .02 E — .06 

Demand for striving —.26 .28 -01 .04 15 18 4 

Overpossessiveness —.27 .04 ns .05 —.14 .42* B 

Harsh punitive control =:22 -30 .05 .07 01 21 12 
Scores of mothers: й [ 

Suppression and distance | — .07 .09 ns .05 .05 —.08 .02 

Unhappiness at home .06 —.02 ns .23 .06 21 .19 

Demand for striving —.04 —.01 ns .29* 07 EN 19 .33* 

Overpossessiveness —.24 —.03 ns .00 16 odi -20 

Harsh punitive control 03 .06 ns :06 128* aT 19 


* Significant at .05 level.
** Significant at .01 level.

4 of the 80 possible correlations were large
enough for significance.


Use of Language with Peers 


Very few relations were found between
parents' PARI composite scores and chil-
dren's use of language with peers, shown
in Table 27, as compared with those ob-
tained for use of hostility with peers, shown
in Table 25. The differences in number of
relations with PARI scores denote a differ-
ence in meaning between children's friendly
and hostile behavior that was not indicated
by relations between aspects of children's 
social behavior, or by relations with other 
home experiences. 

Dramatic Play Language. The most evi- 
dent consistency in the data of Table 27 
is the sex of child difference in relations 
between fathers’ PARI composite scores 
and use of dramatic play suggestion. 
Fathers’ agreement with the ideas of the 
PARI scale seems to have inhibited or 
depressed girls’ expression of ideas in dra- 
matic play at preschool, but not to have 
affected or to have encouraged boys’ use of 
dramatic play language. The sex difference 
was significant for relations of four of the 
five fathers’ composite scores, and is not 
apparent for any scores of mothers. 

The fact that the only significant correlation
among these negative relations was for the 
Suppression and Distance scale has added 
implications. The mean score of all parents 
strongly disagreed with this scale, as is 
shown in Table 23. The only father in the 
sample whose score indicated agreement 
(three points above the midpoint) was the 
father of a boy. In the disagreement range 
from 35 points to the midpoint of 70, there 
were two fathers of girls with scores of 
60 or higher, and five additional fathers 
with scores between 50 and 59 points. The 
other 34 fathers of girls had scores in the 
remaining 15 point range of strong dis- 
agreement. The significant relation was 
obtained, then, for an extremely narrow 
range of disagreement with the ideas of 
the Suppression and Distance scale. To
express the implication of this relation in
positive terms: extremely small increases in
fathers' mild to strong approval of their
daughters' expression of ideas accompanied
fairly large increases in the frequency of
their daughters' suggestions during dramatic
play at preschool.

The relations in Table 27 have the impli- 


from T 
within girls to please their fathers. 
could occur through 
mother's behavior as well as through an 
Electra complex. It could also be developed 
or reinforced through direct training at 
home and elsewhere. Preschool girls are 
not considered too young to be told by 
sundry adults in their world, "You are 
a girl, so you must learn to please men." 
Preschool boys are urged to please their 
mothers, but not women in general. 

There are, however, other explanations 
for these sex differences in relations. The 
finding that home experiences with dramatic 
play topics related to girls’ use of dramatic 
play language and hostility more frequently 
than to these measures of boys' behavior 
was given a different explanation in the 
preceding section. It was described as per- 
haps due to the high levels of both behaviors 
for the sample of boys and their parents. 
This explanation may apply also to the 
sex differences in relations between PARI 
scores and use of dramatic play language. 

Additionally, parents' scores on the PARI 
may indicate factors affecting the provision 
of home experiences with dramatic play 
topics for the child, and so have a multiple 
relation effect on the child's behavior. 
Credence can be given to this explanation 
of the sex differences in relations with 
fathers’ PARI scores, as shown by the data 
presented in Tables 28 and 29. Home ex- 
periences with dramatic play topics de- 
creased for girls when parents agreed more 
with the PARI scales, but parents' agree- 
ment accompanied no change or an increase 
in these experiences for boys. These sex
TABLE 28


PARI composite scale 


Home source of 
information Suppression Unhappiness Demand for Overposses- Punitive 
and distance at home striving siveness | control 
Talk with father —.06 9.17 —.06 .06 | —.06 
.03 .04 .24 16 24 
Talk with mother LT —.06 ly, —.37* 357.22, 
—.07 СОЁ —.03 .08 —.04 
Talk with children 998 = .14 —.30* —.24 —.29 
—.12 .25 .23* AS 12 
Personal experience —.24 —.18 —.20 —.37* —.26* 
.02 .19 .26 .05 Jus 
Books and stories —.47+*+* 2:22 — .43%а — 4240 — 474% 
—.14 .03 -10* E —.08 
Television —.14 —.03 —.318 —.4 14 —.22^ 
5502 .04 .42*+ .21^ ive 
Note.—Boys' average r's in italics.
a Sex difference in average r's is significant at .05 level.
* Significant at .05 level.
** Significant at .01 level.
TABLE 29 


Home source of 


PARI composite scale 


information Suppression Unhappiness Demand for Overposses- Punitive 
and distance at home | striving siveness control 
Talk with father =i 185 .05 .07 —.03 
-05 —.35** T = = 
Talk with mother LG 101 атн гов си 
—{07 =:02 h У = 
Talk with children esl 33 © L4 Ls 
02 15 і i T. 
Personal experience .25 —.05 = rd И 00 
EU 110 E : zo 
Books and stories 94 —.09 = © = Eo = d 
Hs —.16 -04 rs dl Ad Tree 
Television - 5-16 27 —.08 ls 08 
-26 0f zT .09 -14 


Note.—Boys' average r's in italics.
a Sex difference in average r's is significant at .05 level.
* Significant at .05 level.


differences in relations between parents'
attitudes and their provision of home ex- 
periences for the child resemble the relations 
shown in Table 27 between PARI scores of 
fathers and children's use of dramatic play 
suggestion. In both instances, sex differ- 
ences in relations are more often significant 
for fathers’ than for mothers’ PARI scores. 

The significant sex differences in Table 
28 may specify the home sources of infor- 
mation most affected by parents’ attitudes. 
Girls’ opportunities to learn about the dra- 
matic play topics of the preschool group 
through the information sources of books 
and stories, television, and personal experi- 
ence were less than those of boys when 
fathers agreed more with PARI composite 
scales. 

An attempt was made to determine the 
relative importance of home experiences 
with dramatic play topics and PARI scores 
in their opposing relations with children’s 
use of dramatic play language. Partial cor- 
relations were computed between 

a. Children's use of dramatic play sug-
gestion

b. Dramatic play topics checked by par-
ents for the home information source of
personal experience

c. Fathers' scores on the Harsh Punitive
Control scale.

For girls, the partial r's were r_ab.c = .38,
r_ac.b = —.38, and r_bc.a = —.40. In this
instance, neither home experience was par-
tialed out as unimportant. For boys, the
partial r's were r_ab.c = .12, r_ac.b = .22, and
r_bc.a = .29, and are too small to indicate
relations. The relative importance of PARI
scores and home experiences with dramatic
play topics for children's use of dramatic
play language was judged to be impossible
to determine from the data of this investi-
gation.
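For readers who want the arithmetic behind a first-order partial correlation such as r_ab.c, a minimal sketch follows. It applies the standard formula to three zero-order correlations. The input values are invented for illustration, since the zero-order r's among a, b, and c are not reported in this section, and the function name is hypothetical.

```python
import math


def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))
    """
    if abs(r_xz) >= 1.0 or abs(r_yz) >= 1.0:
        raise ValueError("correlations with the controlled variable must be below 1 in size")
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return numerator / denominator


# Invented zero-order correlations among three variables a, b, and c.
r_ab, r_ac, r_bc = 0.6, 0.5, 0.5
r_ab_c = partial_r(r_ab, r_ac, r_bc)  # a with b, holding c constant
r_ac_b = partial_r(r_ac, r_ab, r_bc)  # a with c, holding b constant
r_bc_a = partial_r(r_bc, r_ab, r_ac)  # b with c, holding a constant
```

If one variable's relation with the behavior measure shrank toward zero once the other was held constant, it could be set aside as the less important; when all three partial r's stay about the same size, as reported for girls, no such ranking is possible.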

Correlations for mothers’ scores on the 
Unhappiness at Home and Overpossessive- 
ness scales are not in line with trends for 
other relations in Table 29. The significant 
relation for mothers’ Unhappiness at Home 
scores indicates that boys had fewer oppor- 
tunities to learn from fathers as mothers 
agreed more with this scale. Speculatively,
the usual trends might be reversed, as in
this instance, if the father emphasized
"pleasing the unhappy mother" more than
other aspects of family living.


Significant age differences in relations 
reduced the size of sex group correlations 
for the Overpossessiveness scale in Table 
29. When mothers agreed more with the 
ideas of Overpossessiveness, younger chil- 
dren had fewer opportunities and older 
children had more opportunities to learn 
about dramatic play topics from talking 
with both parents. 

Relations between children's use of dra- 
matic play suggestion and parents' scores 
on single PARI scales resembled those for 
composite PARI scales but were not large 
enough for significance. Five of 96 average
r's were significant, and this occurrence can
be described as possible by chance.
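The reasoning that five significant results among 96 tests can arise by chance can be made concrete with a short calculation. The sketch below treats the tests as independent, which is an assumption made only for illustration (correlations computed on the same children are not independent), and the function names are hypothetical.

```python
from math import comb


def expected_significant(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of significant results when no true relations exist."""
    return n_tests * alpha


def prob_at_least(k: int, n_tests: int, alpha: float = 0.05) -> float:
    """Binomial probability of k or more significant results by chance alone."""
    return sum(
        comb(n_tests, i) * alpha ** i * (1.0 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


# Counts taken from the text: five significant r's among 96 tested.
expected_count = expected_significant(96)
chance_probability = prob_at_least(5, 96)
```

At the .05 level about five of 96 tests are expected to reach significance with no true relations present, so a count of five gives no grounds for interpreting the individual correlations.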

Reality Language. The few significant 
relations between PARI composite scores 
of parents and children's use of reality lan- 
guage, shown in Table 27, resemble on a 
lesser scale those found with children's use 
of reality hostility in Table 25. Children's 
use of reality suggestion and agreement 
with peers tended to increase as either 
parent agreed more with the Demand for 
Striving and the Harsh Punitive Control 
scales. These relations for use of reality
language, combined with those demonstrated
for time spent talking with the maid, suggest
that increased use of reality
language with peers may have "undesirable"
connotations, as well as being nonessential
to getting along with peers at preschool.

Similar relations were suggested, nonsignificantly,
in the average r's for parents’
scores on the 24 PARI scales. 

Relations between two other measures of 
use of reality language and PARI scores 
resemble on a larger scale those found for 
children’s use of reality hostility, as is 
shown in Table 30. The measures of use
of reality greeting and reality question com- 
bined the frequency of use with both chil- 
dren and teachers that was separated for 
other measures of the dramatic play and 
reality classifications. This characteristic 
may influence the size of the correlations.

TABLE 30

                               Reality greeting                Reality question
PARI composite scale           Girls                   Boys    Girls     Boys
Scores of fathers:
  Suppression and distance     .50**                   .18     .24       .22
  Unhappiness at home          .37 [marker illegible]  -.09x   -.11      -.02
  Demand for striving          [illegible]             -.10x   .20       .05
  Overpossessiveness           .35 [marker illegible]  -.19x   .19       .16
  Harsh punitive control       [illegible]             -.05x   .20       .15
Scores of mothers:
  Suppression and distance     [illegible]             .02     .12       .21
  Unhappiness at home          .27                     .06     .00       -.06
  Demand for striving          .15                     .01     .51**     .23
  Overpossessiveness           .15                     .06     .36*      .34*
  Harsh punitive control       .21                     .05     .45**     .24

x Sex difference in average r’s is significant at .05 level.
* Significant at .05 level.
** Significant at .01 level.


Girls’ use of reality language to say 
"Hello" to children 


categories of the reality classification. This 
resemblance is found again for reality questions
in the relations with PARI composite
scales, as is shown in Table 30. The apparent
sex of child and sex of parent differences
in these relations were not significant.
These relations signify that children ask
more questions of children and teachers during
preschool play when their parents favor
punitive control and overpossessive attitudes
about rearing children. Presumably, then,
this use of questions has a greater emotional


scales ranged from 
mothers’ scores, the range was from -.26
to .26. 

Social Acceptance and Participation 
at Preschool 


A glance at the relations for sociometric 
scores in Table 31 should dispel any notion
that the parents' scores on the PARI relate
principally to undesired aspects of children's 
preschool behavior. Sociometric scores of 
boys and girls related significantly in oppo- 
site directions to fathers' scores on three 
PARI composite scales. Relations for 
mothers' scores were similar to those for 
fathers’ scores, but less frequently signifi- 
cant. 

Social acceptance in the preschool group 
had as close relations with parents' scores 
on the pathogenic PARI scales as those 
reported earlier for children's use of reality 
hostility with peers. Parents' scores on these 
scales, then, related as closely to a desired 
characteristic for children as the hostility 
thought to be engendered by these parent 
attitudes. The relations for social accept- 
ance are more complicated, because the close 
relations are in a negative direction for girls, 
and in a positive direction for boys, but are 
not less important than those with children's 
display of hostility. 

The sex difference in relations is clearer 
when relations are described positively for 
both sexes. Girls were more acceptable to 
their preschool peers when parents more
strongly favored encouraging their daughters
to express ideas with as little control
as possible, and also favored procedures of 
guidance that did not include punishment. 
On the other hand, boys were more accept- 
able to their preschool peers when parents 
agreed more with the ideas of suppression 
of expression and punitive control to en- 
force demands. 
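
The text does not state how the significance of a sex difference between two correlations was tested. One common approach for two independent groups is Fisher's r-to-z transformation, sketched below. The choice of this test, the function names, the correlations, and the group sizes are all assumptions made for illustration. The investigation reports average r's, for which the actual procedure may have differed.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher's r-to-z transformation."""
    return math.atanh(r)


def z_difference(r1: float, n1: int, r2: float, n2: int) -> float:
    """z statistic for the difference between two independent correlations."""
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / standard_error


def two_sided_p(z: float) -> float:
    """Two-sided p value of a standard normal deviate."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical correlations and group sizes, for illustration only.
z = z_difference(-0.45, 40, 0.35, 40)
print(round(z, 2), two_sided_p(z) < 0.05)
```

Correlations of opposite sign and moderate size, as in this hypothetical case, differ significantly even in small groups.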

These sex of child differences in relations 
occurred for a group of parents who had 
a relatively small range of divergence from 
the average attitude of disagreement with 
the ideas of several of the PARI composite 
scales. Minor differences in attitude, then, 
were associated with large differences in the 
social acceptability of their sons and daugh- 
ters at preschool. 

These differences in relations suggest that 
girls benefit more than boys from increased 
parental agreement with the “permissive” 
guidance procedures favored by most of 
these parents. The parents of boys, however,
disagreed more strongly with the Suppression
and Distance and the Overpossessiveness
scales than parents of girls, as was
reported in the description of mean PARI
scales. These parents had the usual attitudes
that boys should have more freedom of
expression than girls. The sex differences
in relations for social acceptance raise questions
as to the merit of that opinion and
practice for the preschool ages, and indicate
that preschool girls should be allowed and
encouraged in more freedom of expression
than preschool boys.

TABLE 31

AVERAGE CORRELATIONS FOR PARENTS' SCORES ON PARI COMPOSITE SCALES WITH CHILDREN'S SCORES
ON THE SOCIOMETRIC TEST AND WITH OBSERVATION MEASURES OF FRIENDLY INTERACTIONS
WITH PRESCHOOL PEERS

                               Sociometric scores                               All friendly interactions (a)
PARI composite scale           Girls                    Boys                    Girls     Boys
Scores of fathers:
  Suppression and distance     -.35 [marker illegible]  .28b                    -.36*     -.01
  Unhappiness at home          -.22                     .12                     .14       .23
  Demand for striving          -.46**b                  .36 [marker illegible]  -.10      .21
  Overpossessiveness           .15                      -.16                    -.10      -.21
  Harsh punitive control       -.56**b                  [illegible]             -.06      .21
Scores of mothers:
  Suppression and distance     [illegible]              .35*b                   -.01      .06
  Unhappiness at home          -.08                     .01                     -.07      -.08
  Demand for striving          -.32                     .10                     -.03      -.08
  Overpossessiveness           .09                      .12                     .04       -.11
  Harsh punitive control       -.24                     .12                     .05       .05

a The sum of friendly interactions with peers: association + friendly approach + conversation.
b Sex difference in average r's is significant at .05 level.
* Significant at .05 level.
** Significant at .01 level.

The correlations between PARI scores
and sociometric scores in Table 31 are larger
than, but resemble, those between PARI
scores and dramatic play use of language
in Table 27, and do not resemble the nonsignificant
correlations between PARI
scores and friendly interaction scores in
Table 31. All three measures of children’s
social behavior have been described in this
report as fairly closely interrelated. For
PARI scores, relations with children's sociometric
scores were more important than
those with dramatic play use of language
and friendly interaction scores. For home
experiences with dramatic play topics, relations
with dramatic play use of language
and hostility were more important than
those with friendly interaction scores and
sociometric scores. The measures of child
social behavior clearly have different meanings,
despite their resemblance.

The relations between PARI scores and
the three measures of children's social be- 


children's sociometric scores and observation
measures of social participation, such


investigation, described in a review of social 
acceptance research by 

The relations in the present investigation 
mean that parents’ attitudes about rearing 
children relate more closely to the number 


metric test, than to observation measures


The sex of child differences in relations
between sociometric scores and PARI scores
contradict the idea that the "good" parent
has "usually approved" attitudes, implicit
in the description of the PARI scales as
"pathogenic" by Schaefer and Bell (1958,
p. 346).


Test Aggression 


Relations between children's test aggression
scores and parents' scores on PARI
composite scales, shown in Table 32, resemble
corresponding relations for use of
reality language and hostility in Tables 25
and 27, with one sex of child and sex of
parent exception. Positive relations were
found in most correlations for the two punitive
control scales and for the Unhappiness
at Home scale. These relations also support
hypotheses that punishment leads to aggression;
the aggression in these instances was
toward the experimenter in a doll play test
situation. Other relations reported for the
preschool hostility and the doll play aggression
have suggested that these two behaviors
had no common meaning.

The exception to these trends occurred
for relations between girls' aggression scores
and fathers' scores on the two punitive control
scales, as is shown in Table 32. Relations
for girls’ and fathers’ scores are more
like corresponding relations for sociometric
scores and dramatic play language than
those for reality language and hostility.

TABLE 32

AVERAGE CORRELATIONS BETWEEN PARENTS' SCORES ON PARI COMPOSITE SCALES
AND CHILDREN'S SCORES ON THE TEST OF AGGRESSION

                               Test aggression scores and
                               Fathers' PARI scores                             Mothers' PARI scores
PARI composite scale           Girls                    Boys                    Girls        Boys
Suppression and distance       .26                      -.03                    [illegible]  [illegible]
Unhappiness at home            .03                      .30                     .41*         .19
Demand for striving            -.38 [marker illegible]  .36 [marker illegible]  .23          .36 [marker illegible]
Overpossessiveness             [illegible]              [illegible]             .07          .22
Harsh punitive control         .02                      .32*                    .12          .06

a Sex difference in average r's is significant at .05 level.
* Significant at .05 level.

TABLE 33

AVERAGE CORRELATIONS BETWEEN PARENTS' SCORES ON PARI COMPOSITE SCALES AND
YEARS OF PARENTS’ EDUCATION

                               Years of parents’ education and
                               Fathers’ PARI scores                             Mothers’ PARI scores
PARI composite scale           Girls                    Boys                    Girls        Boys
Suppression and distance       [illegible]              -.25                    -.35         -.36*
Unhappiness at home            [illegible]              -.42 [marker illegible] -.17         -.01
Demand for striving            -.51 [marker illegible]  [illegible]             -.31         -.11
Overpossessiveness             -.46**                   -.32*                   -.15         [illegible]
Harsh punitive control         -.51**                   -.29                    -.37*        -.30

* Significant at .05 level.
** Significant at .01 level.


Parents’ Education 


Parents’ agreement with the PARI com- 
posite scales decreased as their years of 
education past high school increased. Five 
of these negative relations for fathers, 
shown in Table 33, are large enough for 
significance, and two are significant for 
mothers. This finding may be associated 
with two other relations described earlier 
in this report: a positive r of .25 between
fathers’ education and the sociometric scores 
of all children, and the sex differences in 
relations between PARI scores and sociometric
scores.

These relations also may be another instance
of the sensitivity of PARI scores in
relations with other measures. Small variations
in PARI scores have been reported in
this section to accompany small and large
variations in other measures. Few relations
have been demonstrated for parents' education
in this investigation, apparently because
of the limited variation resulting from the
control of sociometric status.

In relations between education of spouse 
and PARI scores, significant relations for 
both parents were limited to a negative rela- 
tion with scores on the Harsh Punitive 
Control scale. 


Other Home Experiences 


Parents' scores on PARI composite scales
did not relate to the time children spent at 
home listening to stories and records or 
watching television. 

Relations for home experiences with dra- 
matic play topics and time spent talking with 
family members and maids have been reported
with relations for use of dramatic
play language and for the Overpossessiveness
scale, respectively, earlier in this
section. 

Summary 


The findings of this section give strong 
support to the supposition that parents’ atti- 
tudes affect their child's behavior. PARI 
scores of parents entered into the conjec- 
tured relations with children's hostility and 
related to almost all measures of this inves- 
tigation. 

The frequency of children's use of hos- 
tility to establish or defend their "rights" 
with age peers increased as parents' agree- 
ment with statements favoring punitive con- 
trol of children increased (or their disagree- 
ment decreased, the more accurate descrip- 
tion for these parents). The exceptions to 
this finding were 4½-5½ year boys; variations
in the high frequency of their use of
reality hostility did not relate to punitive 
control scores of parents. Positive relations 
of a smaller size were found between puni- 
tive control scores and the number of hostile 
interactions with other children, and, for 
boys, the use of dramatic play hostility. 
Hypotheses that punishment leads to aggres- 
sion were supported by these relations and 
by similar relations for children's aggression 
during a doll play test. These hypotheses 
were given more support by relations for 
fathers’ scores than for mothers' scores.

Different relations with children's hostility
occurred for parents’ scores on the
Overpossessiveness scale of the PARI. 
There was a decrease in children's use of 
hostility to carry out the roles of dramatic 
play as fathers, but not mothers, agreed 
more with the ideas of this scale. 

Girls were more popular among preschool 
peers when their parents strongly disagreed 
with the suppression and punitive control 
scales, but boys were more popular when
their parents disagreed less, or agreed more, 
with the ideas of these PARI scales. These 
relations were large and significant for both 
sexes, despite the difference in direction, 
and the most important of those found for 
desired characteristics of children. 

Fathers' agreement with the ideas of all 
PARI composite scales seemed to inhibit 
or depress girls’ expression of ideas in dra- 
matic play at preschool, but to have not 
affected or to have encouraged boys' use 
of dramatic play language. Home experi- 
ences with dramatic play topics also de- 
creased for girls when parents agreed more 
with the PARI scales, and parents' agreement
accompanied no change or an increase
in these experiences for boys. 

Children's use of reality language with 
peers tended to increase as either parent
agreed more with the punitive control scales.
This trend was most marked for the measures
that included use with both children
and teachers: greeting (by girls), and question
(by all children). Children's tested
vocabulary knowledge lacked the emotional 
meaning of relations with PARI scores. 

Parents' agreement with the PARI composite
scales decreased as their years of education past
high school increased.

The only measures that failed to relate 
to PARI scores of parents were measures 
of time spent at home at two types of 
activities: talking with family members and 
maids, and listening to stories and records 
and watching television. 

Minor differences in parents' scores, particularly
fathers' scores, were associated
with large differences in children's behavior.
These parents diverged little from the mean
scores indicating disagreement with most of
the pathogenic PARI scales. Agreement
between fathers and mothers in the same
family was not particularly high.


Relation of Children’s Dependence on 
Teachers to Their Behavior with Age 
Peers and to Their Home Experiences 


Children have social contacts with teach- 
ers as well as with children during child 
directed play at preschool. The frequency 
of these child-teacher interactions can be 
used as a measure of the child’s dependence 
on adults, as has been described by Marshall 
and McCandless (1957a). This section de- 
scribes the use of language and hostility 
during interactions between children and 
teachers recorded for the children of this 
investigation, and the relations between this 
dependence on teachers and both their be- 
havior with peers and their home experi- 
ences. These relations were expected to 
replicate and to help explain the negative 
relations between dependence on teachers 
and interactions with and social acceptance 
by peers, reported by Marshall and Mc- 
Candless for 36 Iowa children. 


Children’s Dependence on Teachers 


Children’s dependence on friendly con- 
tacts with teachers during preschool play 
declined as age increased from 2½ to 6½
years, as is shown in all figures for dependence
measures in this section (Figures
10, 11, and 12). This decline was greatest
between 2½ and 4½ years, but continued
through 6½ years. The effect of age was
significant in all analysis of variance tests, 
and there were no sex differences for meas- 
ures of friendly dependence on teachers. 

The youngest children, aged 2½-3½ years,
had a friendly interaction with a teacher 
once every 4 minutes (or two observation 
records), as is shown in Figure 10. This is 
one-third as frequent as the between chil- 
dren friendly interactions for this age group, 
shown earlier in Figures 5 and 6. The oldest 
children, those aged 5½-6½ years, had a
friendly interaction with a teacher about 
once in 20 minutes, but had four friendly 
interactions with children during each 2- 
minute observation record. 
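
The rates quoted in this paragraph can be placed on a common scale of interactions per 2-minute observation record. The sketch below does only that arithmetic. The variable and function names are illustrative assumptions, and the inputs are the rounded figures given in the text.

```python
RECORD_MINUTES = 2.0  # length of one observation record (from the text)


def per_record(interactions: float, minutes: float) -> float:
    """Convert 'interactions in so many minutes' to a per-record rate."""
    return interactions * RECORD_MINUTES / minutes


youngest_teacher = per_record(1, 4)    # once every 4 minutes
oldest_teacher = per_record(1, 20)     # about once in 20 minutes
youngest_peer = 3 * youngest_teacher   # teacher rate is one-third of the peer rate
oldest_peer = 4.0                      # four interactions per record

teacher_ratio = youngest_teacher / oldest_teacher  # youngest relative to oldest
peer_ratio = oldest_peer / youngest_peer           # oldest relative to youngest
print(youngest_teacher, oldest_teacher, youngest_peer, oldest_peer)
print(round(teacher_ratio, 2), round(peer_ratio, 2))
```

On this scale the decline in teacher contacts and the rise in peer contacts with age can be compared directly.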


These age differences denote a developmental
decrease for children's dependence
on teachers during the preschool years.
They indicate that the age of the child
needs consideration before the child is
labeled “dependent,” or a child with excessive
teacher contacts. Age differences shown
in Figure 10 are very large, and lead to the
inference that comparisons of the number
of child-teacher interactions should be within
a single year, rather than within a preschool
group with an age range from 2½ to
4½ years, or from 3½ to 5½ years. The latter
age range of subjects has been used in most
studies of dependence on preschool teachers,
without allowing for age differences. The
expected greater dependence on teachers of
younger children may have been equated
with excessive dependence in these studies.

The developmental decrease in dependence
on teachers has implications about
children’s learning. During the same circumstances,
child directed play, the youngest
children had five times as many interactions
with teachers as the oldest children in this
investigation. Obviously, the younger children
“needed” more attention from teachers
to enjoy and learn from direction of their
own play. The older children apparently
satisfied their “needs” through their doubled
number of friendly interactions with children.
In other words, the older children
were “learning” primarily from age peers,
while the younger children were directly
influenced by their teachers. These results
justify the belief of preschool educators
that children aged 2½-4½ years need a
smaller pupil-teacher ratio than children
aged 4½-6½ years. Additionally, they suggest
that teachers need different abilities
and teaching methods for the two age
groups.

FIG. 10. Mean number of all friendly interactions
(association + friendly approach + conversation)
and mean number of hostile interactions between
children and teachers per 2-minute observation
record for each age group.

The number of friendly interactions with 
teachers for the 70 children in the two age 
groups from 3½ to 5½ years, shown in
Figure 10, were compared with those for 
children in the same age range observed in 
two other studies. The Kentucky children 
had about the same number of dependence 
interactions as the 60 Hawaiian children 
observed by McCandless, Bilous, and Ben- 
nett (1961), but the 36 Iowa children 
studied by McCandless and Marshall 
(1957b) had about twice as many depend- 
ence interactions as either of these groups. 

Children usually spoke as themselves in 
their interactions with teachers, and seldom 
used the dramatic play classification in such 
interactions. About half of the children 
failed to use the dramatic play classification 
of language with teachers. For example,
dramatic play suggestion, the most fre- 
quently used category in that classification, 
was used with teachers by 24 girls and 25 
boys. It was used in 3% of the observation 
records for girls, and in 4% of the observa- 
tion records for boys. This limited fre- 
quency of use for the dramatic play classi- 
fication meant that it could not be used as 
a measure of dependence on teachers, and 
it is not described further in this section. 

All language measures of dependence on 
teachers, then, are those classified as reality 
use of language. The children were talking
as themselves and in their own behalf,
rather than in carrying out a role in dra- 
matic play. 

The decrease in the frequency of chil- 
dren's suggestions to and agreement with 
teachers as age increased is shown in Figure 
11. The rate of age decline approximated
halving each year the number of suggestions 
or agreements made by the preceding age 
year group. 
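
The halving described here corresponds to a simple geometric decline. The sketch below shows the frequencies expected under exact halving. The starting frequency and the function name are hypothetical and are not read from Figure 11.

```python
def expected_frequency(start: float, years_elapsed: float,
                       yearly_factor: float = 0.5) -> float:
    """Frequency after a number of years of constant proportional decline."""
    return start * yearly_factor ** years_elapsed


start = 16.0  # hypothetical frequency for the youngest age year group
for years in range(4):
    print(years, expected_frequency(start, years))
```

A constant halving each year implies that the oldest of four successive age year groups makes one-eighth as many suggestions or agreements as the youngest.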

The frequencies for suggestion and agree- 
ment from teachers to children are shown 
in Figure 12. These appear to closely
resemble children's use of the categories
shown in Figure 11. 

Close relations were found between these 
child and teacher measures in correlations, 
as is shown in Table 34. The results support 
the grouping of friendly interactions with 
teachers and of use of language to and from 
teachers under the classification of friendly 
dependence on teachers. 

Hostile interactions between children and
teachers occurred much less frequently than
friendly interactions in all age groups, as
is shown in Figure 10. The frequency of
hostility directed by children to teachers is
shown in Figure 13, and that for hostility
directed by teachers to children in Figure
14. These data suggest that dependence
hostility seldom occurred during child directed
play. Hostility from children and
from teachers were closely related; the
average r's between these measures were
.69 for girls, and .87 for boys.

FIG. 11. [...] in which children made suggestions to and agreed
with teachers for each age group.

Age and sex differences in frequency of 
dependence hostile interactions were sig- 
nificant at the .005 level in analysis of 
variance tests. Boys had twice as many 
hostile interactions with teachers as girls. 
Children in the 5½-6½ year group had fewer
hostile interactions with teachers than
younger children.

Whether the child or the teacher initiated
hostile interactions was not recorded. The
observer's impression was that teachers’
hostility tended to be in the form of "nicely"
suggesting that the child do something other
than his current activity. Records for one
preschool group suggest that hostility may
have been initiated by the teacher. In this
preschool group, the head teacher consistently
interrupted conflicts between children
to urge the children to talk, rather than to
hit or push, and this practice was not followed
by other teachers in this investigation.
The 17 children in that preschool group had
a hostile interaction with the teacher in a
mean of 10% of the records, and for the
six 4½-5½ year boys, the mean was 14%.

TABLE 34

[...]RRELATIONS BETWEEN FRIENDLY DEPENDENCE
MEASURES AS SHOWN IN AVERAGE CORRELATIONS

                                      [Suggestion    Agreement      Suggestion     Agreement
                                      from child     of child       from teacher   of teacher
                                      to teacher]    with teacher   to child       with child
[Agreement of child with teacher]
  Girls                               .56*
  Boys                                [illegible]
Suggestion from teacher to child
  Girls                               .66*           .74*
  Boys                                [illegible]    .81*
Agreement of teacher with child
  Girls                               .90*           [illegible]    .65*
  Boys                                .93*           .54*           .64*
Interactions with teacher
  Girls                               .88*           .67*           .70*           .89*
  Boys                                .83*           .72*           [illegible]    .83*

Note. Boys’ average r's, printed in italics in the original, are given in the rows labeled Boys.
* Significant at .01 level.

FIG. 13. Mean percentage of observation records
in which children showed hostility to teachers for
each age group.

FIG. 14. [...] in which teachers showed hostility to children for [...]

The limited use of hostility between children
and teachers indicated that the obtained
data did not discriminate between
children enough for use as measures of
dependence on teachers. Hence, the relations
for these measures are not described.


Relations between Dependence on 
Teachers and Behavior with Peers 


Interactions with Peers. This investigation
replicated the negative relations between
dependence on teachers and children's
friendly peer interaction scores reported by
Marshall and McCandless (1957a). Negative
relations predominate in the average
r's between these measures for children
under 5½ years of age that are shown in
Table 35. Relations for the measure used
by Marshall and McCandless are shown in
the last line of Table 35. On the basis of
results from two studies, then, it can be said
that children in each age group under 5½
years will have more interactions with
teachers when they have fewer interactions
with peers, and vice versa.

At 5½-6½ years, relations between these
measures were positive for boys and negative
for girls. Four r’s for 5½-6½ year boys
differed at the .05 level from the average
r’s for boys of other ages. These differences
may indicate the beginning of the disappearance
of dependence on adults during elementary
school years that was reported by
Wittenborn (1956).

The apparent sex differences in the relations
shown in Table 35 were not significant.
The somewhat larger negative relations
for girls are in line, however, with the 
significant sex differences in these relations 
reported by McCandless and Marshall 
(1957b) for 36 Iowa children. The Ken- 
tucky children had a larger number of inter- 
actions with other children and about half 
the interactions with teachers of the Iowa 
children, as was reported earlier. These 
differences in the extent of both behaviors 
may have affected the sex differences in 
relations between the behaviors. 


Sociometric Scores. This investigation
did not replicate the negative relations between
dependence on teachers and social
acceptance in the preschool group reported
by Marshall and McCandless (1957a).
Average r’s for boys and girls between
dependence measures and sociometric scores
ranged from -.18 to .13. This difference
in results for the two studies may be associated
with the greater dependence of the
Iowa children, mentioned in the preceding
[...]. However, McCandless et al.
(1961) found evidence of this negative relation
in their study of Hawaiian children.
Their findings suggest this relation is closer
when the dependence is emotional rather
than instrumental, a distinction not made
for dependence measures in this investigation.

TABLE 35

AVERAGE CORRELATIONS BETWEEN MEASURES OF DEPENDENCE ON TEACHERS AND
PEER INTERACTION SCORES OF GIRLS AND BOYS UNDER 5½ YEARS OF AGE

                                      Peer interaction scores
                                                     Friendly                      All friendly
Dependence measures                   Association    approach       Conversation   (A + FA + C)
Suggestion from child to teacher
  Girls                               -.36*          -.13           -.08           -.30
  Boys                                -.33           -.04           -.30           -.24
Agreement of child with teacher
  Girls                               -.49**         -.38           -.82           -.50**
  Boys                                -.28           [illegible]    [illegible]    -.22
Suggestion from teacher to child
  Girls                               -.33           -.33           -.27           -.42*
  Boys                                [illegible]    -.08           -.34           -.18
Agreement of teacher with child
  Girls                               -.52**         -.34           -.34           -.50**
  Boys                                -.16           .04            .22            .22
Interactions with teacher
  Girls                               -.44*          -.30           -.34           -.30
  Boys                                -.25           .04            -.43*          -.21

Note. Boys' average r’s, printed in italics in the original, are given in the rows labeled Boys.
* Significant at .05 level.
** Significant at .01 level.

Use of Language and Hostility with 
Peers. Relations between dependence on 
teachers and use of language and hostility 
with peers were found for boys, but not for 
girls, as is shown in Table 36. As depend- 
ence on teachers increased, boys used reality 
language more frequently with peers, and 
less frequently used dramatic play hostility 
with peers. In other words, boys used a 
behavior not essential to getting along with 
peers more frequently, and a behavior essen- 
tial to getting along with peers less fre- 
quently when they were more dependent on 
teachers at preschool. 

The sex differences in relations between
dependence and use of reality language were
not found in relations between dependence
and peer interaction scores, shown in Table
35, or in corresponding relations for sociometric
scores. They suggest that either the
manifestations or the antecedents of dependence
on teachers may not be exactly the
same for boys and girls.

Relations between use of reality hostility
with peers and dependence on teachers are
not included in Table 36 because of significant
differences in relations for 4½-5½ year
boys and girls from those for other children
that were found for four of the five dependence
measures. In the other age groups, use
of reality hostility did not relate to dependence
on teachers, as is shown in the range
of average r's from -.23 to .10. Children
in the 4½-5½ year group used hostility in
their own behalf more frequently when they
showed more dependence on teachers. Average
r's for this age group ranged from .23
to .43, and three were .40 or above and
significant. These age differences are another
instance of atypical child behavior for
the 4½-5½ year boys and girls.

The two categories of the reality classifi- 
cation that combined use of language with 
children and adults, question and greeting, 
would be expected to have positive relations 
with measures of children’s use of language 
with adults. Relations for the frequency of 
questions had the expected positive relations 
with dependence measures. Average r's for
boys and girls were similar, ranged from
.29 to .56, and were significant in 7 of the
10 possible relations.

TABLE 36

AVERAGE CORRELATIONS BETWEEN MEASURES OF DEPENDENCE ON TEACHERS AND THE FREQUENCY OF
USE OF LANGUAGE AND HOSTILITY WITH PEERS BY GIRLS AND BOYS

                                      Dramatic play                 Reality
Dependence measures                   Suggestion     Hostility      Suggestion     Agreement
Suggestion from child to teacher
  Girls                               [illegible]    -.10           .29            .06
  Boys                                [illegible]    [illegible]    [illegible]    .32*
Agreement of child with teacher
  Girls                               -.19           -.18           .00            .15
  Boys                                .00            -.33*          .35*           .30
Suggestion from teacher to child
  Girls                               -.15           -.15           -.04           -.10a
  Boys                                [illegible]    [illegible]    [illegible]    [illegible]
Agreement of teacher with child
  Girls                               [illegible]    -.25           [illegible]    .06a
  Boys                                [illegible]    [illegible]    [illegible]    [illegible]
Interactions with teacher
  Girls                               -.08           -.19           .13            -.10a
  Boys                                [illegible]    -.23           .34*           .46 [marker illegible]

Note. Boys' average r's, printed in italics in the original, are given in the rows labeled Boys.
a Sex difference in average r's is significant at .05 level.
* Significant at .05 level.
** Significant at .01 level.

The expectation of positive relations was 
not verified by relations in all age and sex 
groups for the category of greeting, as is 
shown in Table 37. As children’s age in- 
creased, the relations between the frequency 
of greeting and dependence on teachers 
changed from a positive to a negative direc- 
tion. 

There were no age differences in the 
mean frequency of use of reality greeting 
by these children. Consequently, the rela- 
tions with dependence measures suggest that 
frequent greeting of others was inhibited 
by more dependent children as age in- 
creased. Reality greeting failed to relate to 
any other measures of children’s social 
behavior. This inhibition for dependent 
children may be due, then, to failure of this 
approach to obtain rewards from peers. 
Dependent children obviously are rewarded 
by teachers. 

Vocabulary Age. A consistent positive 
direction of relations and two significant 
r’s suggest that as children's vocabulary age 
increased, they talked more frequently with 
teachers, and so were more dependent on 
teachers by the definition of this investigation.
The two significant r's occurred for
relations that differed with the age or sex
of the child. For most children, a higher
vocabulary age accompanied more teacher
agreement with the child's suggestions
(average r = .46). The exceptions were
4½-5½ year children, for whom the average
r was -.11. Girls agreed more frequently
with teachers' suggestions when their
vocabulary age was higher (average r = .41),
but boys agreed less frequently with
teachers as their knowledge of vocabulary
increased (average r = -.25).


TABLE 37


AVERAGE CORRELATIONS BETWEEN MEASURES OF DEPENDENCE AND CHILDREN'S USE OF
SELF GREETING FOR AGE GROUPS AND FOR SEX GROUPS


                                            Age groups                      Sex groups
Measures of dependence             Under 4½     Over 4½     p of age      Girls       Boys
                                                            difference
Suggestion from child to teacher    .53**a      -.33*         .01         -.49**      .00
Agreement of child with teacher     .11         -.21          ns          -.04       -.09
Suggestion from teacher to child    .33*        -.20          .05         -.19        .05
Agreement of teacher with child     .35*     [illegible]      .05         -.20     [illegible]
Interactions with teacher           .17         -.26          .05         -.29        .12


a These average r's and the p are for boys' groups only, not all children as in other r's listed. Age differences in this relation
were not found for girls.
* Significant at .05 level.
** Significant at .01 level.


Hypotheses to be Tested about the Home
Antecedents of Dependence on Teachers


Most hypotheses about the causes of
dependence in the preschool
years emphasize a motivation in the
child for undue attention from adults. This
motivation is thought to result in a lack of
motivation toward interactions with peers
and, consequently, a lack of skill in such
relations. The antecedents most frequently
studied as causes of this motivation have
been suppression and deprivation (Gewirtz,
1954; Hartup, 1958), and overpossessive
or extremely gratifying parents (Crandall,
Preston, & Rabson, 1960; McCandless et al.,
1961). Suppression and deprivation are
hypothesized to create a "hunger" for more
pleasant adult contacts, while gratification
is believed to create an approach drive
through its rewards for dependent behavior.

A different hypothesis about the antecedents
of excessive dependence in preschool
children is rarely mentioned (Marshall
& McCandless, 1957a), and has not
been studied in research. This hypothesis
has failure with peers as the antecedent of
excessive dependence on teachers. Accord- 
ing to this explanation, children show exces- 
sive dependence because they lack the tech- 
niques and interests required for participa- 
tion in play with peers, and often fail in 
their attempts to play. The finding of Mar- 
shall and McCandless that consistent rela- 
tions were not obtained during the first 
3-4 weeks of preschool experience between 
dependence scores and measures of peer 
participation and acceptance is in line with 
this hypothesis; relations for excessive dependence
on teachers existed only after
several weeks of success and failure in play
with preschool peers. 

Speculation suggests that causes of the 
inadequate techniques and failure with 
peers might be similar to those frequently 
cited for failure in reading: too few experi- 
ences at home that prepare for the task of 
learning to read in elementary school. For 
preschool children, the causes of depend- 
ence might be too few experiences at home 
that prepare for the task of learning to play 
with peers. The existence of such home 
experiences has been demonstrated in this 
investigation. More home experiences with 
the topics of dramatic play accompanied 
more talk and participation in play with 
peers. Social acceptance of girls increased 
as parents indicated stronger belief that 
their daughters should express ideas freely, 
although for boys the reverse was true.

The child and parent measures of this 
investigation permit an attempt to test the 
second hypothesis, as well as the first hy- 
pothesis and findings of other investigators. 
Correlations between measures of depend- 
ence and percentages of dramatic play topics 
checked by parents for home sources of 
information should provide evidence on the 
speculation about home experiences in- 
cluded in the second hypothesis. Two of 
the PARI scales are named for, and thought 
to include, the parental attitudes studied 
as antecedents of dependence in the first 
hypothesis. Correlations between measures 
of dependence and parents' scores on the
Suppression and Distance and the Overpossessiveness
scales should constitute a test
of this hypothesis. The relations between 
dependence measures and the other home 
experiences studied in this investigation may 
indicate support for either hypothesis. 


Relations between Dependence on Teachers 
and Home Experiences 


Home Experiences with Dramatic Play 
Topics. Dependence on teachers increased 
as children had fewer opportunities to learn 
about the dramatic play topics of the pre- 
school group from home sources of infor- 
mation. Negative relations between these 
measures occurred more frequently for 
younger than for older children, however, 
as is shown in Table 38.

The correlations between dependence 
measures and home experiences with dra- 
matic play topics afford the best test in this 
investigation of the second hypothesis about 
causes of dependence in preschool children: 
When children have too few experiences at 
home that provide the techniques and inter- 
est required for participation in play with 
age peers, they will often fail in their 
attempts to play with peers, and, as a consequence,
will show excessive dependence
on teachers at preschool. The negative di- 
rection and size of the correlations for 
younger children, shown in Table 38, fur- 
nish strong support for this hypothesis. As 
these children had fewer home experiences 
with the dramatic play topics of the pre- 
school group, they showed more dependence 
on their teachers, and vice versa. The rela- 
tions between home experiences with dra- 
matic play topics and measures of play with 
peers for these children also agreed with 
the hypothesis, as has been reported. 

The age differences in relations indicate 
that this factor may affect the hypothesis. 
This hypothesis may explain more instances 
of excessive dependence during the years 
from 2½ to 4½, when dependence on teachers
is at a high level for all children, than during
the later preschool years when most
children are relatively independent of
teachers. 
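
The age differences discussed here are differences between correlations obtained in separate groups of children. As an illustration only, one standard way to test such a difference is Fisher's r-to-z transformation. The report does not state in this section which test was used, and the group sizes in the sketch below are hypothetical, since sample sizes are not given here. The two correlations are the Table 38 values for personal experience and suggestion from child to teacher.

```python
import math


def fisher_z(r):
    """Fisher's r-to-z transformation of a correlation coefficient."""
    return math.atanh(r)


def compare_independent_correlations(r1, n1, r2, n2):
    """Two-tailed z test for the difference between two correlations
    computed in independent samples. Returns (z statistic, p value)."""
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (fisher_z(r1) - fisher_z(r2)) / standard_error
    # Two-tailed probability under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical group sizes (assumed for illustration, NOT from the report).
n_younger, n_older = 30, 30

# Younger children: r = -.47; older children: r = .04 (Table 38).
z, p = compare_independent_correlations(-0.47, n_younger, 0.04, n_older)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With other assumed group sizes the same pair of correlations can fall on either side of the .05 level, so the sketch shows the form of the comparison and does not check the reported significance.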

The correlations with dependence meas- 
ures in Table 38 indicate that three home 
TABLE 38


AVERAGE CORRELATIONS BETWEEN THREE MEASURES OF DEPENDENCE ON TEACHERS AND THE 
PERCENTAGES OF DRAMATIC PLAY Topics CHECKED FOR HOME SOURCES OF INFORMATION 
BY PARENTS OF YOUNGER (2½-4½ YEARS) AND OLDER (4½-6½ YEARS) CHILDREN


Suggestion from Agreement of Interactions 
child to teacher teacher with child with teacher 
Home source of information 
Younger| Older | Younger| Older | Younger| Older 
Talk with father odis «13^ 93980 220850 —.34 13 
Talk with mother =.31* .20* —429 m —.27 dyi 
Talk with children: Girls —.64** .60* —.98*» .36* —.61** 348 
All boys —.20 —.33 —.29 
Personal experience —.47** .04* —.43* —.17 —.40* —.01 
Books and stories -.19 .04 -.21 .05 -.16 .10
Television -.19 -.28 -.20 -.28 -.06 -.32


a Age difference in average r's is significant at .05 level.
b These average r's are for boys only. The average r for all girls was -.35.


* Significant at .05 level. 


experiences related more closely to the 
dependence on teachers of younger children 
than experiences with most other home 
sources of information. These experiences 
were obtaining the information about dra- 
matic play topics through the sources of talk 
with father, talk with mother, and personal 
experience. These experiences of learning 
from adults important to the child were 
the home experiences that related most 
closely to children's use of dramatic play 
language and hostility with peers, as was 
reported earlier. Age differences in rela- 
tions for both boys and girls were significant 
for these experiences. 

The other home experience that correlated 
significantly with the dependence shown by 
younger children was talk with children. 
The percentage of dramatic play topics 
checked for this home information source 
entered into larger correlations with depend- 
ence on teachers, shown in Table 38, than 
with use of dramatic play language and 
hostility with peers, shown in Table 17. The 
greater importance of this information 
source in relations with dependence is sug- 
gested, also, by the negative relations for 
boys of all ages, as well as for younger girls. 
Opportunities to learn about dramatic play 
topics from brothers and sisters, or from 
children in the neighborhood, include oppor- 


tunities to learn techniques of play with 
peers. These experiences might provide 
both techniques of play with peers and 
interest in such play. Acquisition of both 
techniques and interest was hypothesized to 
increase the chances of success with peers 
at preschool, and, consequently, to reduce 
dependence on teachers. 

The relations between dependence meas- 
ures and home experiences with dramatic 
play topics suggest a teaching procedure for 
dependent children. Teachers could use the 
dependence conditions (excessive talk with 
teacher) to provide information about the 
dramatic play topics of the preschool group. 
According to the hypothesis and these find- 
ings, this procedure would increase the 
child's interest and probable success in play 
with peers, and hence lead to less depend- 
ence on the teacher. 

Correlations for the two dependence 
measures not shown in Table 38 were 
similar in direction to those listed, but were 
seldom significant. 

PARI Scores of Parents. Scores of par- 
ents on the PARI composite scales entered 
into fewer relations with measures of chil- 
dren's dependence on teachers than with 
other measures of children's preschool behavior.
The only evident consistencies or
significant relations are for those correlations
proposed as tests of the first hypothesis
that dependence precedes decreased
participation with peers at preschool.

For girls, but not boys, consistent positive
relations are found between dependence
measures and fathers' scores on the Suppression
and Distance and the Demand for
Striving scales. The r's are shown in Table
39. Two of these are significant. These
relations suggest that girls are more dependent
on teachers when fathers express attitudes
of suppression and deprivation. This
was studied as a cause of dependence by
Gewirtz (1954) and Hartup (1958). Similar
relations were not indicated for these
PARI scores of mothers. These relations are
in line with the hypothesis, then, but cannot
be described as strong support for it.


TABLE 39


AVERAGE CORRELATIONS BETWEEN MEASURES OF DEPENDENCE AND FATHERS' SCORES ON
COMPOSITE PARI SCALES FOR GIRLS AND BOYS


                                                        Fathers' PARI scores
Dependence measures                 Suppression    Unhappiness   Demand for    Overposses-    Punitive
                                    and distance   at home       striving      siveness       control
Suggestion from child to teacher
    Girls                              .30           -.05           .10           .02            .10
    Boys                              -.17           -.18          -.19          -.05           -.26
Agreement by child with teacher
    Girls                              .40*          -.09           .37*       [illegible]       .29
    Boys                               .07           -.06           .09           .22         [illegible]
Suggestion from teacher to child
    Girls                              .27           -.19           .12           .30            .04
    Boys                               .03           -.10          -.03           .14            .00
Agreement of teacher with child
    Girls                              .26           -.09           .18        [illegible]       .12
    Boys                              -.09           -.22          -.21           .00           -.24
Interactions with teacher
    Girls                              .22           -.21           .06           .05            .05
    Boys                               .06           -.09          -.18           .08           -.13

Note. The second row for each measure gives boys' average r's (italics in the original).
* Significant at .05 level.


For boys, the only apparent relations were
those shown in Table 40 between dependence
measures and mothers' scores on the
Overpossessiveness scale. Younger boys
showed more dependence on teachers when
mothers expressed overpossessive attitudes,
but these attitudes did not relate to the
dependence of boys older than 4½ years.
Relations for younger boys are in line with
the hypothesis that dependence is a consequence
of overpossessive or extremely gratifying
parents, presented in detail by Crandall
et al. (1960) and McCandless et al.
(1961). Similar relations were not indicated
for these PARI scores of fathers or
for dependence of girls.


TABLE 40


AVERAGE CORRELATIONS BETWEEN MEASURES OF
DEPENDENCE FOR YOUNGER AND OLDER BOYS AND
MOTHERS' SCORES ON THE OVERPOSSESSIVENESS
COMPOSITE SCALE OF THE PARI


                                          Boys' age groups
Dependence measures                 2½-4½ years     4½-6½ years
Suggestion from child to teacher       .65**            .02a
Agreement by child with teacher     [illegible]        -.10a
Suggestion from teacher to child       .60*             .24a
Agreement of teacher with child     [illegible]      [illegible]
Interactions with teacher           [illegible]      [illegible]

a Age difference in average r's is significant at .05 level.
* Significant at .05 level.
** Significant at .01 level.

The interpretation of the relations shown
in Table 40 differs from that of the preceding
paragraph, however, if they are considered
in conjunction with the finding about
age differences in relations between mothers'
scores on the Overpossessiveness scale and
home experiences with dramatic play topics,
described in the preceding section. Younger
children had fewer opportunities to learn
about dramatic play topics from talk with
both parents when mothers agreed more
with the ideas of the Overpossessiveness 
scale, while older children had more learn- 
ing experience under these circumstances. 
Considering these relations, the data in 
Table 40 are in line with the hypothesis that 
lack of home preparation for play with peers 
results in frequent failure in this play and 
a consequent dependence on teachers at 
preschool. 

The paucity of other relations with PARI 
scores of parents may mean that the depend- 
ence shown by the children of this investi- 
gation was not as “emotional” as that 
studied by other investigators or found in 
other samples of children. This possibility 
was mentioned earlier as a reason for the 
failure to find relations between dependence 
and sociometric scores in this investigation. 

Time Spent Talking with Individuals at 
Home. Positive relations are shown in 
Table 41 between dependence measures and 
the time older children talked with their 
fathers at home. These relations give sup- 
port to the first hypothesis about depend- 
ence, that dependent children are motivated 
toward more interactions with teachers 
because, in these particular relations, of 
gratification received during more time spent
with father. 

The consistent negative relations obtained 
for younger children suggest that this ex- 
planation does not apply to the excessive 
dependence shown by the more dependent 
younger children. Results for both groups
substantiate the implication, induced from 
age differences in relations between depend- 
ence and home experiences with dramatic 
play topics, that causes of dependence may 
differ with the age of the child. 

The time mothers spent in talk with chil- 
dren did not relate to measures of depend- 
ence. Average r’s ranged from —.13 to .23, 
and age and sex differences in relations 
were negligible. 

As children spent more time at home in 
talk with the maid, they showed more 
dependence on teachers. The exceptions to 
this finding were boys and girls in the 
4½-5½ year group, as is shown in Table 41,
who have been described often in this report 
as differing in relations with home experi- 
ences from the children of other ages. The 
relations for dependence measures in most 
age groups are in line with other findings 
that more talk with the maid at home fos- 
tered children’s frequent use of behavior 
not essential to getting along with peers 
at preschool. 


TABLE 41 


AVERAGE CORRELATIONS BETWEEN MEASURES OF DEPENDENCE ON TEACHERS AND TIME TALKED AT 
HOME WITH FATHERS, SIBLINGS, AND MAIDS FOR THE AGE AND SEX GROUPS IN WHICH 
DIFFERENCES IN CORRELATIONS WERE OBTAINED 


                                      Fathers' time           Maids' time          Siblings' time
Dependence measures                 2½-4½     4½-6½       All but       4½-5½      Girls     Boys
                                    years     years     4½-5½ years     years
Suggestion from child to teacher —.14 .19 Ads .09^ —.02 42 
Agreement by child with teacher .03 -26 15 —:03* +42 12 
Suggestion from teacher to child — .19 43> .35*b —.17° —.345 .18^ 
Agreement of teacher with child TM) .18 .39* — .08 2:01 .06 
Interactions with teacher —.12 .20 "195 — 227 —.02 21 


a Differences for 4½-5½ year boys were significant at .05 level, but not those for girls.
b Age or sex difference in average r's is significant at .05 level. 


* Significant at .05 level. 


If time spent talking with the maid indicated
rejection by or deprivation of mother
or father, these relations could be interpreted
as indicating support of the first
hypothesis. However, the time spent talking
with maids correlated positively with
time spent talking with mothers, and did
not relate to time spent talking with fathers,
as was reported earlier. Maids' time correlated
negatively with the time spent talking
with siblings, a relation more in line
with the second than the first hypothesis.
More time spent talking with maids and
less time spent talking with siblings may
mean fewer opportunities to learn the techniques
and interests required for participation
in play with peers.

This interpretation is not verified by the
relations between time spent talking with
siblings and dependence measures. Most of
these correlations approach zero in size, and
the only consistent trend in age and sex
groups was the difference in the direction
of relations for the two sex groups shown
in Table 41.


Time for Stories, Records, and Television.
When children spent more time
viewing television at home, they showed
less dependence on the teacher at preschool.
Negative correlations were obtained for
most age and sex groups. The size of the
average r's for all children was large enough
to be significant in two relations for minutes
watching television: with number of friendly
interactions with teachers (-.27), and
with the frequency of suggestions from
teacher to child (-.33). Additionally, for
boys, the average r of -.32 was significant
between minutes watching television and
agreement by the child with the teacher.

Television is a major source of children's
information and interest in cowboys and
other western lore, a frequent topic of preschool
play. For that reason, these relations
probably offer the same kind of support to
the second hypothesis about dependence as
those found for home opportunities to learn
about the dramatic play topics of the preschool
group.

No relations existed between dependence
measures and either time for stories or time
for records at home. Fifteen of the 20 sex
group correlations were in the range from
.00 to ±.10, and there were no indications
of age differences in these relations.

Education of Parents. Measures of de- 
pendence on teachers did not correlate 
significantly with years of education of 
either parent for all children. However, age 
differences in relations between education 
of mothers and dependence measures were 
suggested by all correlations between these 
measures, and were significant at the .05 
level in two instances. Children younger 
than 4½ years showed less dependence on
teachers when their mothers had more years
of education (average r's from -.21 to
-.34). Children older than 4½ years showed
more dependence on teachers when their
mothers had more years of education
(average r's from .13 to .29).


Summary 


Relations for dependence measures provided
strong support for a hypothesis about
preschool dependence not previously tested
in research. The hypothesis is as follows:
when children have too few experiences at 
home that provide the techniques and inter- 
ests required for participation in play with 
peers, they will often fail in their attempts 
to play, and, as a consequence, will show 
excessive dependence on teachers. 

As the children in this investigation had 
fewer home experiences with the dramatic 
play topics of their preschool group, they 
had fewer friendly interactions with peers, 
and they showed more dependence on 
teachers at preschool. Additionally, the 
more dependent boys used dramatic play 
hostility less frequently with peers, and 
more frequently used reality language, a
behavior reported earlier as not essential
to getting along with peers. Age differences
in some of these relations suggested that
this hypothesis may explain more instances
of excessive dependence during the years
from 2½ to 4½, when dependence is at a
high level for all children, than during the 
later preschool years when most children 
are relatively independent of teachers. The 
home sources of information most clearly 
related to dependence were talk with father,
talk with mother, talk with children, and 
personal experience. 

Other relations suggestive of support 
for this hypothesis were (a) time viewing 
television at home correlated negatively 
with dependence, (b) time spent talking 
with maid at home correlated positively with 
dependence, and (c) negative relations 
occurred between mothers' scores on the 
Overpossessiveness PARI scale and both 
the home experiences with dramatic play 
topics of younger children and the depend- 
ence on teachers of younger boys. 

However, relations for dependence meas- 
ures also provided some support for hypoth- 
eses that suppression and deprivation, and 
overpossessive and gratifying parents moti- 
vate the child to seek dependence on teach- 
ers, and, as a consequence, the child lacks 
motivation and skill for interactions with 
peers. Girls showed more dependence on 
teachers when their father had higher scores 
on PARI scales for suppression and depri- 
vation. Because of the small size of most 
correlations, this support of the hypothesis 
cannot be described as strong. Younger 
boys showed more dependence when their 
mothers agreed more with the Overposses- 
siveness scale of the PARI (interpreted also 
as support for the other hypothesis). Older 
boys showed more dependence when they 
had the gratification of more time spent 
talking with fathers at home. 

The developmental decline in dependence 
on teachers was greatest between 2½ and 4½
years, but continued through 6½ years. The
reality classification of language was used 
in most dependence interactions. Depend- 
ence hostility was rare. 

Dependence related differently to the edu- 
cation of mothers and to the child's vocabu- 
lary age for older and younger children. 

Dependence on teachers failed to relate 
to most children’s sociometric scores, use of 
reality hostility, time spent talking with 
mothers and siblings, time spent listening 
to stories and records, and fathers' education.
Dependence relations for 4½-5½ year
boys and girls often differed from those for 
all other children. 


Discussion 


To many readers, the most important of 
the many relations demonstrated in this 
investigation are those indicating that when 
parents and adults talk with the preschool 
child about more of the topics the child
can use in play with other children, the 
child talks about and plays these topics more 
frequently with peers, and has a better 
chance of social acceptance in the preschool 
group. These relations suggest "positive" 
ways to induce socially accepted and, hence, 
"desired" behavior in children. Knowledge 
of such relations is of primary importance
to the professions devoted to children's
behavior problems, as well as to the professions
concerned with the personality
development and education of "nonproblem"
children. 

In establishing these relations, this inves- 
tigation delineated a new dichotomy of 
observable behavior in children: the use of 
dramatic play language and hostility, and 
the use of reality language and hostility 
during play with other children at preschool. 
The implications of the findings that these 
are different and often opposing variables 
may be of greater importance to present 
knowledge of children's behavior than the 
relations described in the first paragraph. 

Most present knowledge of preschool 
children's social behavior is the result of 
investigations that used measures combining 
behavior in these two classifications. When 
measures from the two classifications were 
combined in this investigation to form a
supposedly more general characteristic of
social behavior, such as the number of
friendly interactions with peers, the relations
for the so-called general measure
depended on those obtained for the two
classifications. Sometimes the correlations
fell about midway between the opposing
relations for the two classifications; in
other instances correlations resembled the
larger of those obtained for the classifications.
More knowledge of children's behavior
was obtained from relations for the
two classifications, than from relations for
measures combining behavior in these
classifications. These findings raise questions
about the meaning of results of earlier
studies that used measures combining the
dramatic play and reality dichotomy. At the
same time, they suggest a new reason for
the often reported contradictory evidence.
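
The way a combined measure depends on its two components can be illustrated with the standard formula for the correlation of a sum of two variables with a third. The sketch below is an illustration only and not a calculation from this investigation: the function name, the standard deviations, and the example correlations are all hypothetical.

```python
import math


def composite_correlation(r_dy, r_ry, r_dr, sd_d=1.0, sd_r=1.0):
    """Correlation between an outside variable Y and the sum D + R.

    r_dy: correlation of component D with Y
    r_ry: correlation of component R with Y
    r_dr: correlation between the two components
    sd_d, sd_r: standard deviations of D and R
    """
    # Covariance of (D + R) with Y, divided by the SD of Y.
    covariance_term = sd_d * r_dy + sd_r * r_ry
    # Standard deviation of the sum D + R.
    sd_sum = math.sqrt(sd_d ** 2 + sd_r ** 2 + 2.0 * r_dr * sd_d * sd_r)
    return covariance_term / sd_sum


# Hypothetical values: opposing relations, equally variable components.
midway = composite_correlation(0.40, -0.40, 0.0)

# Same relations, but component D varies three times as much as R.
dominated = composite_correlation(0.40, -0.40, 0.0, sd_d=3.0, sd_r=1.0)

print(round(midway, 2), round(dominated, 2))
```

Under these assumptions, equal and opposite relations cancel when the components are equally variable, while a more variable component pulls the combined correlation toward its own relation.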
Other findings about this dichotomy sug- 
gest there may be other errors in present 
conceptions of “general” categories of pre- 
school children's social behavior. Char- 
acteristics of children's behavior long ac- 
cepted as being opposite in meaning, such 
as friendly and hostile behavior and ascend- 
ant-submissive behavior, were not so in 
fact when given the additional classification 
of the dramatic play and reality dichotomy.
The children who were most often friendly
during dramatic play were the children most
often hostile to others during dramatic play,
and similar positive relations were found
within the reality classification. Nevertheless,
the children most often friendly during
dramatic play were not the children who
were most often friendly in reality talk
with peers, and the children who displayed
the most hostility during dramatic play
were seldom the children who showed the
most reality hostility to peers. In the
dichotomy for ascendance (makes sugges- 
tions) and submission (agrees with others), 
the children who made the most suggestions 
during dramatic play were the children who 
agreed most frequently with others during 
dramatic play, and they were not the same 
children as those who most frequently made 
suggestions and agreed with others in reality 
talk with peers. Hence, findings of this 
investigation raise questions about the mean- 
ing of results of studies using measures that 
distinguish either friendly and hostile be- 
havior, or suggestion and agreement, with- 
out having made a distinction between 
dramatic play and reality play. 

Two questions must be answered by any 
research claiming to have demonstrated a 
new variable. One is: Why has this variable
not been identified in earlier studies? A
partial explanation in this instance is sug- 
gested by trends in use of research tech- 
niques in child development. The dramatic 
play and reality classifications of this study 
were based on frequency counts from time
sampling observation records of children's
behavior in a life situation. This method
of collecting data was developed and used 
frequently in the 1930s. The method and 
findings of many studies done in these 
years on children’s social development, such 
as Parten's (1932) classification of the
development of social participation, and 
Anderson’s (1939) study of domination and 
socially integrative behavior, are close pred- 
ecessors of the child behavior aspects of this 
study carried out 20 years later in time. 
Had this method of data collection con- 
tinued to be popular, perhaps these variables 
would have been identified 15 years ago. 
From 1940 to the present, however, this 
time-consuming method of observation has 
been replaced in most research on children's 
social development by the less time-consum- 
ing estimates of real life behavior: rating 
scales for use of observers and teachers, 
and tests or experimental situations of ag- 
gression, frustration, dependence, etc. The 
dramatic play and reality variables of this 
investigation could not be identified by use 
of these data collection methods. 

The other question is: Are there earlier 
investigations that, by hindsight, report 
findings similar in any way to those of the 
present investigation? The dramatic play 
classification of this investigation was de- 
fined as children’s use of language and hos- 
tility to carry out the roles of dramatic play 
in preschool groups. One of the best known 
investigations in child psychology, Baldwin’s 
(1948, 1949) study about the effect of home 
environment on nursery school behavior, 
presents evidence of relations between home 
experiences and the dramatic play of chil- 
dren in preschool. An estimate of the 
degree to which children engaged in dra- 
matic play was required for one of the 45 
rating scales for nursery school behavior 
used in Baldwin's study. The means for the 
dramatic play scale were those selected for 
use in the often published diagram illustrating
the association between children's behavior
and democratic, indulgent, and warm
patterns of parent behavior. Dramatic play 
scale means were higher when parental 
patterns were rated as democratic, but 
were not associated with the two parental 
patterns of indulgence and warmth. There 
is no other evidence in this or other Fels
Research Institute reports, however, that 
Baldwin and coworkers thought the dra- 
matic play variable might warrant more use 
in future investigations than the other 44 
child behavior scales they studied. 

The evidence of the present investigation 
suggests that the most promising child 
behavior measures of this study for use 
in future research on positive parent-child 
relations are those in the dramatic play 
classification of children's use of language 
and hostility. Both the dramatic play and 
reality classifications have promise for 
studies of negative influence of parents on 
children's behavior. 

The home experiences that had the largest 
and most consistent correlations with de- 
sired aspects of children’s behavior at pre- 
school were experiences that are usually 
described as intellectual, rather than as 
affectional or socially stimulating. These 
experiences exposed children at home to 
information about the dramatic play topics 
of the preschool group, described in this 
report as including most aspects of the 
child’s environment. Positive correlations 
in the .40 to .60 range were obtained be- 
tween these experiences and children’s use 
of dramatic play language and hostility, the 
number of their friendly interactions with 
peers, and social acceptance in the preschool 
group. The inference is inescapable that 
these home experiences were affectional and 
socially stimulating as well as intellectual. 
This conclusion is in line also with findings 
that measures of home experiences with 
dramatic play topics involving either talk 
with parents and adults or personal experi- 
ence had larger correlations with desired 
social behavior than measures of informa- 
tion about the topics gained through books, 
records, and television. 

Evidence has been presented throughout 
this report that the girls in this investigation 
were handicapped in their coeducational 
preschool world by the fact that their par- 
ents and those of boys were following 
generally accepted practices and attitudes 
about rearing boys and girls differently. In 
the relations for the parent practices described
in the preceding paragraph, correlations
were larger for girls, but they
averaged significantly fewer opportunities 
to learn about the dramatic play topics at 
home than boys, and they used dramatic 
play language and hostility less frequently 
and had fewer social interactions than boys.
In other words, girls were given fewer of 
these socially stimulating experiences, yet 
girls were more affected by differences in 
these experiences than boys. 

The sex difference in experience with 
dramatic play topics seemed to be that boys' 
parents talked about and provided more 
experience with so-called "men's work," 
such as construction, destruction, and cow- 
boys, than girls' parents. This difference is 
not justified by consideration of their future 
occupations; e.g., the boys are not likely to 
grow up to be cowboys, the girls may need 
to know more about house construction than 
the boys, and the destruction in the wars 
these children may encounter is thought to 
be of as great concern to women as to men. 
All children were given much experience
with “women’s work,” or home and family
situations. In other language, the girls were
given experiences thought to foster their
easy acceptance of the female role and
identification with mothers, while the boys
were given opportunities to learn roles of
both sexes. The evidence of this investigation
suggests that the emphasis on future
sex role differences handicapped the preschool
girls in their actions at preschool.

It is not assumed that it would be easy 
to remove from preschool girls the limita- 
tions of exposure to information about 
their environment that is not related to 
home life. It would mean a fairly drastic 
revision of attitudes about the interests of 
preschool girls. To illustrate this point, 
would you think of asking a little girl to 
go on a trip to see the construction of a 
house or highway as readily as you would 
think of asking her to go see a super doll 
house on display in a store? In visiting her 
at home, would you ask the little girl to 
show you her toy trucks and airplanes 
before you asked her to show you her dolls? 
And which would you think of first for a 
little boy, who cannot fail to also have
frequent exposure to home and family
topics?

The same kind of implication can be 
drawn from some relations for parents’ 
attitudes measured by the PARI scales, the 
other home experience that related closely 
to children's preschool behavior. PARI 
scores indicated that parents of boys be- 
lieved their child should have greater free- 
dom of expression than parents of girls. 
This finding conforms with generally
accepted opinion about rearing boys and 
girls. Differences in this parent attitude 
did not interfere with the high level of 
boys’ use of dramatic play language and 
hostility or with their social acceptance in 
the preschool group. Girls, however, ex- 
pressed ideas less frequently in dramatic 
play and were less popular among peers 
when their fathers agreed more with the 
ideas of the suppression scale. These girls 
gained more than the boys from encourage- 
ment to freely express ideas, but attitudes 
about sex role differences apparently limited 
their exposure to this beneficial attitude. 

The need for talk of adults to be specifi- 
cally oriented to children’s interests, if it is 
to accompany desired behavior of children 
in preschool groups, was given emphasis by 
the relations for measures of time spent 
talking with individuals at home. More time 
spent talking with persons at home did not 
accompany desired behavior for the child in 
peer groups, but accompanied undesired behavior.
Additionally, time spent talking
with adults did not relate to the percentage 
of dramatic play topics discussed with adults 
in five of the six possible correlations. 

The narrow range of socioeconomic status 
for these families did not prevent the dem- 
onstration of relations between children’s 
social behavior and either home experiences 
with dramatic play topics or parents’ scores 
on the PARI. Both of these types of home 
experience are known to vary with socio- 
economic status (McCarthy, 1954; Schaefer 
& Bell, 1958; Zuckerman, Barrett, & Bragiel,
1959). Socioeconomic status was controlled
in this study to see if relations
existed between home experiences and children's
behavior that were independent of
the descriptive but nonexplanatory socioeconomic
status classification. The occurrence
of relations for these two types of
home experiences indicates that they war- 
rant further study as possible causes of 
differences in children’s social behavior with 
peers. 

There is evidence in this study that chil- 
dren’s use of language with peers does not 
depend on children’s knowledge of words 
on the Stanford-Binet Vocabulary test. 
Home experiences selected on the basis of 
evidence of relations with vocabulary and 
other aspects of language development did 
not relate in most instances to the vocabu- 
lary age of these children. The latter find- 
ings are not in contradiction to earlier 
studies of language development because of 
the control of socioeconomic status. Rather, 
the evidence indicates that home experiences 
found in earlier studies to relate to language 
development may relate to children’s social 
use of language regardless of relations with 
language development and socioeconomic 
status. 

The idea that children may show behavior 
that does not increase or decrease with age 
is new to child development professional 
workers (Harris, 1957), and has not been 
given additional meaning through many 
investigations. In this investigation, the 
child behavior classified as reality use of 
language and hostility did not change as 
age increased. This child behavior appeared 
to fluctuate from day to day, and from week 
to week. Relations obtained for the reality 
classification suggest that frequent use of 
these categories of language and hostility 
may be an indication of personal or emo- 
tional difficulties, a meaning suggested also 
by Harris' results. 

The findings of this investigation suggest
that, of the measures in the reality classification,
the frequency of use of reality hostility
in play with peers is the most sensitive to
personal difficulties (it enters into more
relations with other variables).
The language measures in the reality classi- 
fication correlated positively with use of 
reality hostility, but related to fewer child 
and parent variables. Children’s use of hos- 
tility to establish or maintain their personal 
rights had closer positive relations with
possible causes of personal difficulties, such
as punitiveness of parents, in this investiga- 
tion than the more general estimates of 
aggression or hostility used by other investigators
(e.g., Sears et al., 1953). However,
present evidence is insufficient to judge 
whether the frequent use of reality hos- 
tility indicates personal difficulties better 
than infrequent use of dramatic play lan- 
guage and hostility; the dramatic play 
measures had negative relations with pos- 
sible causes in parents' behavior. 

Psychological hypotheses that punishment 
leads to aggression were given support by 
relations between parents' scores on puni- 
tive control scales of the PARI and all 
measures of hostility and aggression used 
in this study. However, hypotheses that 
anxiety about expressing aggression will 
prevent its displacement to nursery school 
peers, but will not prevent its displacement 
to a doll play situation were not supported 
by results of this investigation. These 
hypotheses were developed and supported 
by the findings of Sears and students (Hollenberg
& Sperry, 1951; Sears et al., 1953).
In their investigation, ratings of mothers' 
punitiveness related in opposite directions 
for boys and girls to ratings of their aggres- 
sion at preschool, and correlations were of 
moderate size. However, both boys and 
girls expressed more aggression in a doll 
play situation when their mothers had 
higher ratings for punitiveness. 

In the present investigation, large positive 
correlations were obtained between both 
parents’ punitive control scores and the 
display of reality hostility to peers at pre- 
school by both boys and girls. Correlations 
between parents’ scores and children's 
aggression scores from a doll play test of 
frustration were moderate in size, and the 
direction of relations with fathers' scores 
differed for boys and girls. These findings 
are a direct contradiction of those reported 
by Sears and students and of their displace- 
ment hypothesis. 

The only common meaning for measures 
of children's hostility at preschool and their 
aggression during doll play in this investiga- 
tion were the positive relations with the 
punitive control scores of parents. Test
aggression scores failed to relate to observed
behavior of children at preschool and to 
other aspects of home experience. Conse- 
quently, the results of the present investiga- 
tion disagree with almost all aspects of the 
assumption of the importance of aggression 
during doll play followed by Sears and 
coworkers in studies subsequent to those 
reported in 1950 and 1953 (see introductory 
statements by Levin & Sears, 1956, pp. 
135-136). 

The difference in relations obtained for 
preschool behavior in the two investigations 
was explained as probably due to differ- 
ences in procedure in the section of this 
report on relations for PARI scores. The 
largest correlations of the present study were
for the measure of use of reality hostility, 
and the reality-dramatic play distinction is 
unique to this investigation. Fathers were 
included as subjects in the present investiga- 
tion and their punitive control scores had 
larger correlations with children's hostility 
at preschool than mothers' scores. The child 
sample of this investigation had a 4 year 
age range, and the differences in relations 
for 4½-5½ year boys could be recognized,
and hence, not distort findings, as may have
occurred in the Sears et al. study. 
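
The point about age range can be illustrated with a small numerical sketch. The scores below are hypothetical and are not data from this investigation; they show only how a relation that is strong in most age groups and absent in one group can yield a moderate pooled correlation that describes neither. The names `pearson_r`, `records`, and `r_for` are assumptions introduced for the illustration.

```python
def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical records (not data from the report):
# (age group, parent punitive control score, child hostility score)
records = [
    ("other", 10, 2), ("other", 12, 3), ("other", 14, 5),
    ("other", 16, 6), ("other", 18, 8),
    ("4.5-5.5", 10, 9), ("4.5-5.5", 12, 7), ("4.5-5.5", 14, 9),
    ("4.5-5.5", 16, 8), ("4.5-5.5", 18, 9),
]


def r_for(group=None):
    """Correlation within one age group, or across all records if group is None."""
    rows = [row for row in records if group is None or row[0] == group]
    return pearson_r([row[1] for row in rows], [row[2] for row in rows])


pooled = r_for()            # all children together
other = r_for("other")      # age groups where the relation holds
middle = r_for("4.5-5.5")   # age group where the relation is weak

print(f"pooled r = {pooled:.2f}, other ages r = {other:.2f}, "
      f"4.5-5.5 years r = {middle:.2f}")
```

In this invented example the pooled coefficient falls between the two within-group coefficients, which is the kind of distortion that separate analysis by age year avoids.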

The differences in relations obtained for 
doll play behavior may be due also to pro- 
cedural differences. The doll play situation 
in the present investigation was a controlled
presentation of specific equipment and dolls, 
rather than the uncontrolled use of a family 
of dolls used by Sears and coworkers. 
Behavior obviously may differ in such dif- 
ferent circumstances. This possibility needs 
exploration in research before consideration 
is given to ideas that the antecedents of
aggression during doll play situations may 
only occasionally resemble the antecedents 
of children's aggression in life situations.

It has been often mentioned in this report
that relations for 4½-5½ year boys differed
significantly from relations obtained for 
older and younger boys. This age difference 
in relations has been reported also for girls, 
particularly in relations for dependence 
measures. The age differences in relations
occurred most frequently for relations between
hostility or dependence measures and
home experiences, but were not limited to
these child behaviors. This age group of
boys had higher scores on all three measures 
of hostility than other boys and girls, and 
girls of this age had higher scores for use 
of dramatic play hostility, but not the other 
two measures, than other girls. 
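
This section does not restate how differences between correlations were tested. One common approach, shown here as a sketch and not as the procedure of this investigation, is Fisher's r-to-z transformation for two correlations obtained from independent groups. The function name `fisher_z_difference` and the sample values are hypothetical.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Compare two independent correlations with Fisher's r-to-z transformation.

    Returns the z statistic and a two-sided p value under the normal
    approximation. Each group needs more than three cases.
    """
    z1 = math.atanh(r1)  # Fisher transformation of the first correlation
    z2 = math.atanh(r2)  # Fisher transformation of the second correlation
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Hypothetical values: r = .60 for 28 boys of other ages,
# r = .05 for 28 boys in the middle age year.
z, p = fisher_z_difference(0.60, 28, 0.05, 28)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

With small groups the standard error is large, so only sizable differences between coefficients reach conventional significance levels.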

The excessive hostility and the failure of 
preschool behavior to relate to home experi- 
ences that related to such behavior in older 
and younger children suggest that the children
in this age year, particularly boys,
were in the throes of some big problem. 
But what was their problem? This investi- 
gation cannot answer that question. The 
specific age differences in relations suggest 
some limitations of its area, and give some 
support to Freud’s Oedipus and Electra 
complexes as possible explanations. 

First, this age is that suggested by Freud 
as the time of attempts at solution of the 
Oedipus and Electra complexes. This is a 
big enough problem to result in the behavior 
shown by these children. 

Second, the age differences for these chil- 
dren probably were cultural or develop- 
mental in origin. There were boys from this 
age year in all five preschool groups ob- 
served in 1957, and they were all typical 
examples of this age group of boys. As 
was mentioned in the developmental differences
section, the relatively placid 5½-6½
year boys in this study had attended these 
preschools in the preceding year, and, by 
teachers’ reports, had then behaved in the 
same way as the 4½-5½ year boys of this
investigation. 

Third, these children seemed to be 
troubled by a problem involving independence
of parents. Many of the differences
in relations for 4½-5½ year boys were failures
to find evidence of parental influence
that was demonstrated by fairly large cor- 
relations for other children. These atypical 
relations are best exemplified by the failure 
of the excessive hostility of the 4½-5½ year
boys to relate to the punitive control scores 
of parents. Relations for dependence on 
teachers and for watching television suggest 
that the influence of other persons and situations
was greater for 4½-5½ year children
than for the other ages. The 4½-5½
year olds showed more reality hostility to
peers as dependence on teachers increased. 
They engaged in more conversation with 
peers and showed more dramatic play hos- 
tility when they had learned about more 
dramatic play topics through television. 
Neither of these relations existed for older 
or younger children. 

Fourth, the problem included relations 
with both parents for boys. The lack of 
relations between punitive control scores 
and 4½-5½ year boys’ use of reality hostility
was found for both mothers' and fathers' 
scores. The failure at this age to find posi- 
tive relations between the percentages of 
dramatic play topics checked for home in- 
formation sources occurred for both infor- 
mation sources: talk with father and talk 
with mother. The inclusion of both parents 
in the problem does not conflict with Freud's 
hypotheses if they are interpreted to mean 
that rejection of both parents is a prerequi- 
site for identification with the father. 

The behavior reported here for 4½-5½ year
children needs replication and identification 
in other investigations to determine the 
problem of this age group and its meaning. 
The many differences warrant suggesting 
that this age year and the years on either 
side be included in studies of social behavior 
of preschool children until such time as 
more definitive evidence is available. Ac- 
cording to the evidence of this study the 
inclusion of other ages is particularly 
important for investigations of children’s 
hostility to peers and dependence on teach- 
ers. The evidence suggests also a need for 
caution in generalizing findings obtained for 
this age year to the other preschool years. 

This study has provided both new and 
corroborative evidence about children’s de- 
pendence on adults other than parents, a 
characteristic that has been given fairly 
extensive research and theoretical explora- 
tion in the past 15 years. Dependence of 
children was shown to be a characteristic 
so closely associated with small age differ- 
ences as to suggest that age needs more 
careful control in the study of dependence 
than has been given in earlier studies. 

A hypothesis about a home cause of preschool
children’s dependence on teachers
was elaborated and given its first research
test in this investigation. It was strongly
supported by relations obtained in life situa- 
tions. Correlations for behavior in life situ- 
ations cannot demonstrate that a proposed 
cause is such in fact, but they justify devis- 
ing experimental tests of these relations. The 
“cure” aspects of this hypothesized cause, 
a lack of home experience with many dra- 
matic play topics of the preschool group, 
can be given experimental tests similar to 
those devised by Gewirtz (1954) and Hartup
(1958) for another hypothesized cause of 
dependence, suppression and deprivation. 
Sex differences obtained in relations between
measures of dependence and PARI
scores agree with findings and hypotheses 
of others (McCandless et al., 1961; Sears 
et al., 1953) that parent behavior may relate 
differently to dependence of boys and girls. 
The sex differences in relations represent 
two explanations advanced for dependence: 
(a) that suppression and deprivation foster 
dependence, as was found here for girls in 
relations with parents’ suppression and puni- 
tive control scores; and (b) that extremely 
gratifying parents may develop habits of 
dependence in their children (Crandall et 
al., 1960; McCandless et al., 1961), as 
was found here for boys in relations with 
mothers’ overpossessiveness scores. 
Relations demonstrated in this research 
contribute new knowledge about factors 
affecting social acceptance among preschool 
children, although this was not a purpose 
of the investigation. Parents' attitudes were 
shown to be factors relating to the socio- 
metric choice of their sons and daughters 
by other children at preschool that did not, 
at the same time, relate to the number of 
their child's friendly interactions with peers. 
This finding is the first evidence, aside from 
sex differences, to account for the discrep- 
ancy between children's sociometric scores 
and their observed participation with peers 
that has been reported by all investigators 
(Marshall, 1960). The relations indicated 
that girls who were encouraged to express 
ideas freely at home were more popular 
among peers. For boys, however, popularity 
among peers was greater when parents 
believed that some restraints should be
placed on their expression of ideas. The
frequency of children's use of dramatic
play language and hostility was shown to
be as good a predictor of children's socio- 
metric scores as the number of their friendly 
interactions with peers, and this was, of 
course, a completely new finding. Also new 
was the finding that the frequency of chil- 
dren's talk as themselves did not relate to 
their sociometric scores. This investigation 
verified an earlier report (Marshall & Mc- 
Candless, 1957b) that children do not base 
their likes and dislikes on the hostility dis- 
played by peers, but placed some limits on 
this generalization. It was found to apply 
to all types of hostility for girls, but to be
limited for boys to the hostility they dis- 
played in their own behalf. Boys apparently 
had to show hostility in carrying out the 
roles of dramatic play to be popular. 

The children of this investigation were 
younger than those used as subjects in most 
investigations of the effects of television. 
The two measures of their use of television 
entered into more relations with their behav- 
ior than comparable measures for use of 
books and records. The findings suggest, 
however, that television is not a major 
influence on preschool children’s behavior 
with age peers, and hence, agree with the 
findings of a study of 10-14 year old chil- 
dren in England (Himmelweit, Oppenheim, 
& Vince, 1958). Its possible influence on
these young children appeared to be in 
desired rather than in undesirable direc- 
tions: e.g., the measures related positively 
to children’s vocabulary age, and negatively 
to their use of reality hostility and their 
dependence on teachers. Relations for these 
children, then, were not in line with 
suppositions that children will show more 
undesirable behavior as their exposure to 
available programs on television increases.

This investigation explored relations between
children's behavior with peers and
teachers at preschool and several aspects
of home experience. Subjects were 108
children, aged 2½-6½ years, 101 of their
mothers, and 101 of their fathers. Families
were within upper levels of socioeconomic
status.

The social behavior measures that related
most frequently to home experiences were
observation measures developed in this investigation
for two classifications of children’s
use of language and hostility with
peers at preschool: use of language and
hostility to carry out the roles of dramatic
play; and use of language and hostility in
reality talk and play as themselves. Language
and hostility measures within each
classification were positively related, and
either no relations or negative relations were
found between these classifications.


The difference in meaning of the dramatic
play and reality classifications of
children’s use of language and hostility with
peers is shown in the following lists of
relations for both classifications. The frequency
of children’s use of dramatic play
language and hostility with peers:

1. Was reliable over time (r's from .76

2. Increased with age

3. Was greater for boys

4. Increased as social acceptance in the
group increased

5. Increased as the number of friendly
interactions with other children increased

6. Increased as home experiences with
the dramatic play topics of the preschool
group increased

7. Decreased as children spent more time
talking with the maid at home

9. Decreased as fathers, but not mothers,
agreed more with the PARI scale of Overpossessiveness

10. Decreased for boys, and did not
change for girls as dependence on teachers
at preschool increased

In contrast, the frequency of children’s
use of reality language and hostility with
peers:

1. Changed greatly over time (r's from

2. Did not change as age increased

3. Did not differ for the two sexes, except 
that hostility was used more by boys 

4. Did not relate to social acceptance in 
the group 

5. Did not relate to number of friendly 
interactions with other children, except in 
a few instances for girls 

6. Either decreased or did not change as
home experiences with the dramatic play
topics of the preschool group increased

7. Increased as children spent more time 
talking with the maid at home 

8. Increased as either parent agreed more 
with PARI scales emphasizing suppression 
and punitive control of children (use of 
reality hostility increased markedly as both 
parents agreed more with the scales) 

9. Either increased or did not change as
parents agreed more with the PARI scale
of Overpossessiveness 

10. Increased (excepting hostility) as de- 
pendence on teachers at preschool increased 

These relations suggest that children's 
frequent use of dramatic play language and 
hostility with peers: (a) is essential to 
getting along with age peers at preschool, 
(b) is desirable behavior for children, and 
(c) depends upon wide informational ex- 
perience for the children at home and upon 
their parents' attitudes about suppression, 
punitive control, and overpossessiveness. 
They suggest also that children's frequent 
use of reality language and hostility with 
peers: (a) is not essential to getting along 
with peers at preschool; and (b) indicates
difficulties at home for the child, such as 
punitive, demanding, suppressive, or over- 
possessive parents, or too much time spent 
in talk with a maid. 

It was concluded from relations within,
between, and for the dramatic play and 
reality classifications that these classifica- 
tions were different variables of children’s 
social behavior. Because these classifications
and their relations have not been described 
previously, and because neither classification 
related to the vocabulary age of the child 
(from a test in the area of language development
that has been well explored), it was
concluded also that these two classifications
were "new" variables. 

Measures combining these two classifica- 
tions, such as the number of friendly or 
hostile interactions with other children, and 
the measures of all aggression, submission, 
and dominance, did not relate to as many 
aspects of experience. The size and direc- 
tion of relations for the combined scores 
were those to be expected from: (a) the 
opposing relations found for dramatic play 
and reality measures, (b) the relative pro- 
portions of dramatic play and reality meas- 
ures in the combined measure, or (c) a 
difference in the variability of the dramatic 
play and reality measures in the combined 
measure. 
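
The three conditions listed above follow from the standard formula for the correlation between one variable and a sum of two others. The sketch below applies that formula; the function name `composite_correlation` and all numerical values are hypothetical and are not results of this investigation.

```python
import math


def composite_correlation(r_xa, r_xb, r_ab, sd_a=1.0, sd_b=1.0):
    """Correlation between a variable x and the sum of two components a + b.

    r_xa, r_xb: correlations of x with each component.
    r_ab: correlation between the two components.
    sd_a, sd_b: standard deviations of the components, which also reflect
    how large a share each component contributes to the combined score.
    """
    # Covariance of standardized x with (a + b).
    covariance = r_xa * sd_a + r_xb * sd_b
    # Variance of (a + b).
    variance = sd_a ** 2 + sd_b ** 2 + 2 * r_ab * sd_a * sd_b
    return covariance / math.sqrt(variance)


# Hypothetical case: a home measure relates +.50 to a dramatic play score
# and -.30 to a reality score, and the two scores are unrelated.
equal_spread = composite_correlation(0.50, -0.30, 0.0)

# Same relations, but the reality score varies three times as much,
# so it dominates the combined measure.
reality_dominates = composite_correlation(0.50, -0.30, 0.0, sd_a=1.0, sd_b=3.0)

print(f"equal variability: r = {equal_spread:.2f}")
print(f"reality score more variable: r = {reality_dominates:.2f}")
```

Opposing component relations shrink the combined coefficient, and a change in the relative variability of the components can reverse its sign, which is consistent with the smaller number of relations reported for the combined measures.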

The two aspects of home experience that 
related closely to many aspects of children's 
preschool behavior and to other aspects of 
home experience were: experiences that 
provided information about the dramatic 
play topics of the preschool group, and 
parents' scores on four composite scales of
the PARI. 

As children were given information about 
more dramatic play topics of the preschool 
group from eight home sources of information:

1. Their use of dramatic play language 
and hostility with peers increased. 

2. The number of their social interactions 
with peers increased. 

3. Their social acceptance in the group 
increased. 

4. Their dependence on teachers de- 
creased. 

5. Their use of reality language and hos- 
tility either decreased or did not change. 

6. The vocabulary age of boys, but not 
of girls, increased. 

7. Parents of girls, but not of boys, dis- 
agreed more with the suppression and puni- 
tive scales of the PARI. 

8. Younger children's mothers disagreed 
more, while older children's mothers dis- 
agreed less with the Overpossessiveness 
scale of the PARI. 

Opportunities to learn about more dra- 
matic play topics through talk with parents 
and other adults important to the child had
the largest correlations with measures of
children’s behavior. It was concluded that
if parents and adults talk with the preschool
child about more of the topics the child
can use in play with other children, the child
will talk about and play these topics more
frequently with preschool peers, and will
have a better chance of social acceptance in
the preschool group.

As parents, particularly fathers, agreed
more (or disagreed less, the more accurate
description for these parents) with the suppression
and punitive control scales of the
PARI:

1. Their children more frequently showed
reality hostility to peers, and, to a lesser
extent, hostile interactions with peers increased,
and sons' use of dramatic play
hostility increased.

2. Their sons’ aggression scores on a doll
play test of frustration increased, and this
relation was found also for girls’ and
mothers' scores.

3. Their daughters’ popularity in the preschool
group decreased, but their sons'
popularity increased.

4. Their daughters less frequently used
dramatic play language, but this use of
language increased for sons.

5. Their children more frequently used
reality language with peers.

6. Their daughters, but not sons, showed
increasing dependence on teachers.

7. Their daughters' home experiences
with dramatic play topics decreased, but
these experiences either increased or did
not change for sons.

8. Their years of education beyond high
school decreased.

Relations for parents’ scores on the Overpossessiveness
scale of the PARI differed
with the age and sex of the child and with
the sex of the parent. As fathers, but not
mothers, agreed more with this scale, their
children’s use of dramatic play hostility decreased,
and the number of boys' hostile
interactions with peers decreased. As mothers,
but not fathers, agreed more with this
scale, younger boys showed more dependence
on teachers while older boys showed
less dependence, and younger children and
all boys had fewer opportunities to learn
about dramatic play topics from talk with 
both parents. 

Minor differences in parents' PARI
scores, particularly fathers' scores, were 
associated with large differences in chil- 
dren's behavior. It was concluded that par- 
ents' attitudes affect both desired and un- 
desired aspects of their child's behavior 
with peers. 

The time family members spent in talk 
with the child seldom related to children's 
behavior or to other aspects of home experi- 

y ence. However, as children spent more time 
in talk with the maid at home, their use of 
reality language with peers increased, their 

y “use of dramatic play language and hostility 
decreased, and their dependence on teachers 
increased, 

The time children spent at home listening 
to stories and records and watching tele- 
vision seldom related to their behavior at 
preschool. 

Few differences in child behavior and
home experience were associated with differences
in years of education of parents;
this was described as probably due to the
control of socioeconomic status in this
investigation.

Relations for measures of children’s dependence
on teachers provided strong support
for a hypothesis about preschool
dependence not previously tested in research:
when children have too few experiences
at home that provide the techniques
and interests required for participation in
play with peers, they will often fail in their
efforts to play, and, as a consequence, will
show excessive dependence on teachers. As
the children in this investigation had fewer
home experiences with the dramatic play
topics of their preschool group, they had
fewer interactions with peers, and they
showed more dependence on teachers at
preschool. Age differences in some relations
suggested that this hypothesis may explain
more instances of excessive dependence during
the years from 2½ to 4½, when dependence
is at a high level for all children, than
during the later preschool years when most
children are relatively independent of
teachers.


Relations for dependence measures also 
provided some support for hypotheses that 
suppression and deprivation, and overpos- 
sessive and gratifying parents motivate the 
child to seek dependence on teachers and, 
as a consequence, the child lacks motivation 
and skill for interaction with peers. There 
were sex of child differences in relations 
supporting the two hypothesized causes of 
dependence motivation. Girls showed more 
dependence on teachers when their fathers 
had higher scores on PARI scales for sup- 
pression. Younger boys showed more de- 
pendence when mothers agreed more with 
the PARI Overpossessiveness scale, and 
older boys showed more dependence when 
they had the gratification of more time spent 
talking with fathers at home. 

Two test scores of these children failed 
to relate to most of the preschool and home 
experience measures of this investigation. 
Children’s vocabulary age on the Stanford- 
Binet Vocabulary test did not relate to 
children’s use of language with peers, or to 
most other child behavior and home experi- 
ences. Children’s aggression scores on a test 
of frustration failed to relate to their ob- 
served hostility to peers and to any home 
experiences other than the punitive control 
scores of the PARI. It was concluded that 
these test scores estimate different child 
behavior than that observed with peers at 
preschool in this investigation. 

Age and sex differences seldom occurred
in relations between the measures of social
behavior with peers for these children aged
2½-6½ years. As the summaries indicate,
sex differences were frequent in relations
between these measures and estimates of
home experiences, and in relations for
dependence on teachers. Age differences in
these two types of relations also were found
frequently. In relations for home experiences,
4½-5½ year boys often differed from
boys and girls of other ages. Age differences
in relations for dependence on teachers
were found for two different divisions
of these preschool years: (a) children
younger than 4½ years and children older
than 4½ years, and (b) 4½-5½ year children
and children in all other age groups.
APPENDIX A


QUESTIONS ON EXPOSURE TO IDEAS IN PLAY FOR PARENTS


What activities does your child watch you 
do at home? 


a. Elsewhere? 


b. Do you talk about what you are doing 
to the child as you do it? 


c. Does your child hear you talk about
these activities:


To other family members?
Over the telephone?
To visitors?


Does your child share your meals and conversation?


If so, is much of the talk "over his head"?


Do you have a regular special time devoted
to talking with your child, such as just
after stories every day, or before bedtime,
etc.?


About how much time each day do you
usually spend conversing with the child
(during the week)?

Do you spend more time conversing with
him on weekends?

How much more time?


Does anyone working in your home, such
as a maid, talk with him (her) while doing
their work?

About how much time do they spend talking
with the child (daily)? (weekly)?

Are there older brothers ( ) or
sisters ( ) who talk with him
(her)?


About how much time do they talk each 
day? (Each) 




a. Do you read to the child, usually every
day? For how long?

b. Does he (she) listen to records?
How often? About how long
each time?


c. Does he (she) watch TV or listen to
radio (underline which)?


How often?
How long each time?
Are there any regular programs?


d. Has the child seen any movies (other
than on TV)? What?


Through which of all the above possibilities 
(give card to parent) has your child been 
exposed to these concepts used before 
Christmas by the children in their make-believe
play?

(Check answers on other sheets) 

(After Question 8) 


Are these people, activities and things the
same as those your child seems interested
in at home?

Do these seem not too interesting at
home?


Are there omissions from the list of play
that you would have expected to be included
because your child uses them so
much in make-believe play at home?
Or in conversation?


What are your child's favorite topics when 
he (she) talks with you? 


Are you trying to interest your child in
some particular activities or people?
BEREA COLLEGE NURSERY SCHOOL DRAMATIC PLAY IDEAS

Talk with you | Books | TV or radio | Movies | Records | Personal experience | Talk of other adults | Talk of other children


Policemen-bankrobbers 


Doctor-nurse-patient 


Cowboy-cowgirl 


Indians 


Badmen 


Roy Rogers and Trigger 


Island in an ocean 


Tunnel in a mountain 


Building road for trucks 


Building a castle, bridge 


Trains 


Airplanes and airport 


Putting on a play 


Fixing cars in a garage 


Painting a house 


Grocery store 


Hiding in a cave 


Witches 


Driving tractor—farm or garden 


Make a little creek 


Dynamite, firecrackers


Take animals to zoo 


House on fire—fireman 


Animals—growl and crawl 


Big bad wolf blows house down 


Tigers 


Lions 


Hound dogs 
use. In spite of this progress and general
enlightenment, tuberculosis remains a dis- 
ease to be highly dreaded and the public 
retains the image of the tuberculosis patient 
as someone to be avoided as a health threat. 

Hospitalization for the treatment of tu- 
berculosis creates many problems for the 
patient. Because it is a contagious disease, 
the patient must be isolated. For the patient, 
hospitalization results in his being uprooted 
from his usual environment for months on 
end, often leaving the family without adequate
income, and depriving the patient and
his family of mutual emotional support. 
The hospital requires a certain amount of 
regimentation for smooth and efficient oper- 
ation, yet this tends to force a dependent 
role on the patient. Sanitary precautions 
needed to prevent spread of the disease are 
restrictive and unnatural to the patient and 
his family. 

The treatment process itself is also very 
frustrating. Progress is very slow. It is 
measured in months rather than in hours, 
days, or weeks as with other illnesses. The 
patient feels fine, practically from the start 
of treatment, yet must accept the fact that 
he is ill and needs to be kept isolated. The 
threat of possible lung surgery is constantly 
there. Surgery is considered a critical oper- 
ation, the need for which is often only 
partially understood and the possible results 
frequently considered not worth the risk 
involved. The patient can hardly help but 
wonder as he talks to some of the "old
chronic" patients on the ward whether he
will really get well and stay well. The 
future which must be so far postponed may 
look far from promising as he is concerned 
with worries about holding his place at 
work, and about what relationships in the 
family and neighborhood will be like after 
he returns home. 

It is because of these "psychological" aspects
of pulmonary tuberculosis that psychologists
have been assigned to wards
where these patients are treated. Out of a
common interest in these problems, a group
of such psychologists in Veterans Administration
hospitals met in 1956 to formulate
a coordinated research program designed to
study the psychological aspects of the illness
and its treatment. Subsequently the large
scale cooperative study, which is the subject
of this report, was developed.

Because of its nationwide program of 
medical care, the Veterans Administration 
has been able to contribute significantly to 
the field of medical research through the 
medium of cooperative research programs.
These have included projects involving sev- 
eral disciplines with simultaneous data col- 

lection in many of the 170 hospitals and 


potential advantage of having differences
between institutions and other chance factors
tend to cancel each other out, thus providing
findings more representative of the
total group of patients to whom the results
may be applied.

Tuberculosis, in particular, has lent itself
readily to cooperative research. Large scale
investigations have been carried out by the
Veterans Administration and the Armed
Forces (Transactions, 1959), the United
States Public Health Service and the British
Medical Council. Among these have been
many studies reporting the effectiveness of
various combinations of drugs (Quarterly
Progress Report, 1958).

These investigations have led to many
advances in the treatment of a disease which


threat of some residual degree of permanent
disability. With such a background the
application of cooperative procedures to the
study of the psychosocial factors in the
treatment of tuberculosis was a natural
development.


Summary of the Literature 


Almost 100 years ago an article was published
by Clouston (1863) in which he
maintained that there was a relationship
between tuberculosis and emotional disturbance.
During the next 80 years many hundreds
of reports appeared, all imputing
some personality characteristic or other to
tuberculosis patients.

Although these reports served as a rich
source of hypotheses, they did not materially
advance our knowledge. All too often
observations were made on a small number
of cases; contradictory conclusions were
extremely common; subjective rather than
objective evaluations were the predominant
procedure; and research designs adequately
testing the various hypotheses were
unheard of.²

During the past 20 to 25 years a number
of experimental studies have appeared, testing
a variety of hypotheses and advancing
our knowledge in several ways. We no
longer think of there being a "tuberculosis
personality," thanks to the work of Derner
(1953) and others. Differences between
scores on psychological tests of tuberculosis
patients and non-ill people do occur
(Andreychuk, 1954; Hand, 1952; Muldoon,
1957; Page, 1947), but these are seen in
the context of the patients' experience of
hospitalization. A number of studies report
work on specific emotional characteristics
such as depression (Derner, 1953) and
anxiety (Derner, 1953; McElroy, 1950),
and enable us to conclude that these
frequently are found in tuberculosis patients.
Some relationship between psychological
factors and response to treatment is strongly
suggested by the work of Brotman (1955).
Certain behaviors can be predicted in
tuberculosis patients. Discharge against
medical advice can be predicted from
demographic data (Moran, Fairweather, Morton,
& McGaughran, 1955) and from projective
techniques (Calden, Thurston, Lewis, &
Lorenz, 1956; Vernier, Whiting, & Melser,
1955). Behaviors within the hospital can
be predicted from a sentence completion
test specifically designed for tuberculosis
patients (Calden, 1953; Thurston, Lorenz,
& Calden, ZOT—C-68) and from questionnaires
(Moran et al., 1955; Pauleen, 1955).
Little experimental work on posthospital
behavior has been done, other than evaluation
of rehabilitation training programs
(Agur & Anderson, 1954; Armstrong, 1953;
Longton, Wagner, & Meier, 1950; Marion
& Salkin, 1959; Warren, 1954) or of
physiological recovery (Alling, Bosworth, &
Lincoln, 1955; Guest, 1943).


2 It is not our purpose here to review these
early reports. Excellent summaries can be found
in articles by Berle (1948), Derner (1953),
Merrill (1953), Harris (1952), and others.
Throughout the experimental literature,
however, behavior is considered as synony- 
mous with segmental aspects of personality 
(as anxiety, depression, or psychotic-like re- 
actions) or as synonymous with specific 
actions (as leaving the hospital or as work- 
ing after discharge). None of the research 
attempts to evaluate the patient's behavior 
as a total response to a complex environ- 
mental situation. And no study can be 
found which provides: an integration of 
medical, social, psychological and behavioral 
data, as contributed by representatives of 
several professional disciplines; and a longitudinal
approach covering both hospital and
postdischarge adjustment. 


The Cooperative Psychological Research 
Projects 


Rationale. The projects that are described 
in this report were based on the underlying 
premise that a person's psychological make- 
up largely determines the manner in which 
he will adjust to the complex changes in 
his life pattern which are demanded by 
hospitalization for treatment of a somatic 
illness such as pulmonary tuberculosis. 
These environmental changes may be clas- 
sified into those related to his hospitaliza- 
tion, his medical treatment, and his return 
to the community after release from the 
hospital. The patient’s adjustment in these 
three areas is viewed as being influenced 
in part by psychological factors. The pres- 
ent research was designed to study the 
nature and degree of the relationships 
between psychological factors and behavior 
in each of the three areas. Accordingly, 
three studies were designed. The first seeks 
to determine the relationship between psy- 
chological factors and the way the patient 
adjusts to his hospital environment. The 
second investigates how his psychological 
makeup relates to the kind of response he 
makes to the medical treatment. The third 
studies psychological factors as they relate 
to the patient’s adjustment following his 
return to the community. 

Research Design. All three of the studies 
provided for the use of the same basic test- 
ing procedures. All patients were given a 
battery of psychological tests and the investigator
obtained the same personal his-
tory and social data in all three studies. The 
primary differences between the studies are 
found in the measuring devices used to 
assess the criterion with which each was 
concerned, i.e., hospital adjustment, re- 
sponse to medical treatment, and posthos- 
pital adjustment. 

Two forms were used to collect the basic 
demographic and factual data for each 
subject. An identifying data sheet was used 
to record information usually found in hos- 
pital records, such as his age, diagnosis, 
military service, etc. A personal informa- 
tion sheet provided additional data available 
only from the subject such as facts about 
his education and family background. The 
subject completed this form at the same 
time as he took the psychological tests. 

Several criteria were used in the selection 
of a suitable psychological test battery for 
these studies. Total testing time should be
relatively short so that the test battery 
could be completed in one session and not 
overtax the patients. Optimal time was felt 
to be about one hour. It was also deter- 
mined that tests to be selected should be 
easy to administer, capable of objective 
scoring, appropriate to a wide range of edu- 
cational level, and nonthreatening to the 
subjects. In addition the battery should 
attempt to sample a wide range of psycho- 
logical variables. With some reluctance it 
was determined that many of the tests cus- 
tomarily used by the psychologist did not 
meet these criteria. By applying the selec- 
tion standards the following battery was 
chosen:

1. Personnel Test for Industry, Form A:
a 5-minute wide-range intelligence test
composed of 50 multiple-choice items

2. IPAT 16 Personality Factor Test,
Form C: an objective personality test
developed by factor analysis (Cattell, 1956)

3. Madison Sentence Completion Test,
Short Form: 20 incomplete sentences
devised specifically for tuberculosis patients

4. House-Tree-Person Test: a projective
test in which the subject is asked to draw
on separate pages a house, a tree, and a
person (Buck, 1948)

Two of the four tests, the Personnel Test
for Industry and the 16 Personality Factor
Test, are objective tests with standardized
scoring systems. The Sentence Completion
Test and the House-Tree-Person are projective
tests and required the development
of scoring systems in order to objectify the
results for use in the statistical analysis.
Following a survey of the literature, a number
of variables which could be judged at a
fairly objective level were selected, e.g., in
the person drawing, the presence or absence
of a belt. The projective tests of an initial
sample of 500 subjects were scored using
these variables. Adequate scoring variables
for the "tree" and "house" were not available
in the literature and these were omitted
from the study.

All of the test scores were intercorrelated
for this sample and by inspection groups of 
scores which were significantly related to 
each other were selected. From these group- 
ings variables which seemed to make psy- 
chological sense were chosen for later com- 
parison with the criteria for each study. 
This process might be described as one of 
"rational factor analysis." Each of the 
seven groups of items shown in Table A1 
is used as a single personality scale in the 
analysis of the results. 
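The grouping step described above can be sketched in code. The following is only an illustration under stated assumptions: the report does not say which correlation coefficient was applied to the test scores (a Pearson coefficient is assumed here), the variable names and data are hypothetical, and the final choice of groups that "make psychological sense" was a matter of judgment that no program reproduces.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def related_pairs(scores, threshold):
    """List every pair of variables whose |r| reaches the threshold.

    `scores` maps a (hypothetical) variable name to its list of scores.
    The threshold stands in for whatever significance level is chosen.
    """
    pairs = []
    for (name_1, v1), (name_2, v2) in combinations(scores.items(), 2):
        r = pearson(v1, v2)
        if abs(r) >= threshold:
            pairs.append((name_1, name_2, round(r, 2)))
    return pairs


# Hypothetical scores for three variables on five subjects.
example_scores = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10],
    "C": [5, 1, 4, 2, 3],
}
candidate_groups = related_pairs(example_scores, threshold=0.5)
```

The resulting list of related pairs would then be inspected by the investigator, who keeps only those groupings that are psychologically interpretable.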

In order to further facilitate analysis of 
the data it was decided that as far as pos- 
sible the criterion measures to which the 
above psychological tests and the demo- 
graphic variables would be compared should 
be simplified by the development of single 
quantitative measures of the criteria. In 
order to fulfill this requirement, indices of
hospital adjustment, response to treatment,
and readjustment to the community were
developed. Detailed description of these 
measures will be presented in the sections 
devoted to the specific studies which make 
up the project. 


An Over-All View of the Project 


The protocols and research materials for
each of the three studies were distributed
to 18 Veterans Administration hospitals in
September 1957. Data collection was completed
were forwarded to the Central Research 
Laboratory at the Veterans Administration 
Hospital, Baltimore, Maryland. 

The number of patients available for 
analysis of results varies according to each 
study. This results from the fact that par- 
ticipation by each hospital was voluntary 
and some chose to participate in all three 
studies and some in only one or two of the 
three. The total number of patients tested 
in all phases of the project was 814. Forty- 
seven of this number were females and are 
not included in this report. The number of 
subjects used for each of the individual 
studies will be presented in the subsequent 
sections.

It may be of interest to note that for the 
total sample of 767 male veterans, the 
average age was 43.1 years. Seventy-six 
percent were Caucasian, 22% Negro, and 
2% from other ethnic groups. Thirty percent
were single, 48% married, 4% widowed,
and 18% separated or divorced. The
average years of education for the total 
group was 9.9. 

Each of the following sections presents the
report of one study area: hospital adjustment,
response to treatment, and posthospital
adjustment. Each will describe the rationale,
design, and development of criterion meas- 
ures which are unique to the specific study. 
The results obtained from comparison of 
the psychological and demographic variables 
to the criteria will then be presented and 
discussed. The final section will integrate 
the findings of all three studies, consider 
their implications, and outline further devel- 
opments planned in this research program. 


HOSPITAL ADJUSTMENT 


Study of adjustment of tuberculosis patients
to their hospital experience was
thought relevant to the present cooperative
project for a number of reasons, which were
presented previously. Because of the complex
nature of the adjustment process, many
facets must be considered in a study of
hospital adjustment. Theoretically, at least,
consideration of the entire gamut of psychological
and social factors involved would be
desirable. This would include the basic 
psychological equipment which the patient 
brings with him to the hospital situation, 
the specific characteristics of the ward and 
of the hospital which the individual experi- 
ences as a patient, and the extrahospital 
environmental factors which are of impor- 
tance to the patient. This intensive approach 
was not possible in the present study. In- 
stead, broad classifications of interpersonal 
and situational factors were made and serve 
as orienting points around which the proc- 
esses involved in adjusting to the hospital 
situation could be ordered. 

This section will consider the hospital 
adjustment area of the project and will 
include a description of the measurement 
devices used, the statistical method em- 
ployed, and the attempts to develop an index 
which would facilitate comparison of hos- 
pital adjustment with other areas of the 
total study. How well the psychosocial 
variables predict this index and the char- 
acteristics of "good" and "poor" hospital 
adjusters will also be presented. 


Measurement 


The Measuring Device. "Hospital adjustment," 
for purposes of this study, refers to complex sets 
of behaviors which are determined by several 
factors. These factors do not lie solely in the 
need structure which the individual brings to the 
institutional setting; the atmosphere of the hospital 
and the reactions toward the patient on the part 
of the hospital personnel can exert a large influence 
on the behavior from which the character of his 
hospital adjustment will be ultimately judged. 

In measuring hospital adjustment, then, views of 
personnel toward the patient may be as important 
data as are the attitudes of the patient. Thus four 
multiple-choice rating scales were developed³ to
be completed by the patient, his physician, and
the nurse and the aide who knew him best,
respectively. The physician's form contained 8 items;
those for the nurse and the aide contained 11 each; 


3 Appreciation is expressed for the contributions 
to the development of these scales made by the 
following: Theodore Andreychuk, Rohrer, Hibler, 
and Replogle, formerly at Veterans Administration 
Hospital, Downey, Illinois; and Leon Soffer, 
Veterans Administration Regional Office, Phila- 
delphia, Pennsylvania, formerly at Veterans Ad- 
ministration Hospital, Downey, Illinois. 
the patient's form contained 22 multiple-choice
items and was supplemented by 5 sentence
completion items designed to give the patient some
latitude in describing his feelings toward
tuberculosis. All four forms included a space at the
end in which the rater could make additional
remarks.⁴

Criteria for selection of items and areas to be 
covered were that: (a) they reflect, in the investi- 
gators’ judgment, a significant aspect of the hos- 
pital situation; (b) the information asked be 
pertinent for the experience of the rater; (c) they 
relate to areas of hospital routine and interpersonal
interaction; (d) the task of completing
the ratings would take a minimum of personnel 
time. 

It will be noted that the four scales are not exactly
the same. Although their being homologous might
have simplified statistical operation, there is reason
to doubt that all personnel were equally
capable of judging, for example, the patient's 
sleeping and eating habits or his relationships 
with other patients and with personnel. Thus, each 
rater was asked to make only those judgments 
about the patient which his particular interaction 
with the patient qualified him to make. 

Some Statistical Considerations. Ratings described
above were completed on 500 male patients.
Frequency tabulations of the response choices
for each item were compiled, cutoff points which
dichotomized the response distributions were
chosen, and tetrachoric correlations were computed
among all items from the four scales. A
summary of results showing the percentage of
significant interitem correlations is given in
Table 1.
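The dichotomize-then-correlate procedure just described can be illustrated with a short sketch. It is not the authors' computing method, which the report does not specify; the cosine-pi formula used below is a well-known approximation to the tetrachoric coefficient, swapped in here because it needs only the four cell counts. The function names and the example counts are hypothetical.

```python
import math


def dichotomize(scores, cutoff):
    """Mark each score as plus (True) if it lies above the cutoff, else minus."""
    return [s > cutoff for s in scores]


def fourfold_counts(x, y):
    """Cell counts (a, b, c, d) of the 2 x 2 table for two plus/minus lists.

    a = plus/plus, b = plus/minus, c = minus/plus, d = minus/minus.
    """
    a = sum(1 for xi, yi in zip(x, y) if xi and yi)
    b = sum(1 for xi, yi in zip(x, y) if xi and not yi)
    c = sum(1 for xi, yi in zip(x, y) if not xi and yi)
    d = sum(1 for xi, yi in zip(x, y) if not xi and not yi)
    return a, b, c, d


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation: r_t ~ cos(pi / (1 + sqrt(ad / bc)))."""
    if a * d == 0 and b * c == 0:
        raise ValueError("coefficient is undefined for this table")
    if b * c == 0:
        return 1.0   # no discordant cases
    if a * d == 0:
        return -1.0  # no concordant cases
    return math.cos(math.pi / (1.0 + math.sqrt((a * d) / (b * c))))


# Hypothetical fourfold table for two rating items.
r_t = tetrachoric_cos_pi(40, 10, 10, 40)
```

The approximation is closest when both items are split near their midpoints; with the unequal splits used for most items in this study, an exact tetrachoric routine would be preferable.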

For all subsequent statistical analyses, the 
original sample of 500 male patients was reduced 
to 350 in order to include only those for whom all 
rating and psychological test data were complete; 


4 The rating scales used in this study have been
deposited with the American Documentation Institute.
Order Document No. 6740 from ADI
Auxiliary Publications Project, Photoduplication
Service, Library of Congress, Washington 25,
D. C., remitting in advance $1.75 for microfilm
or $2.50 for photocopies. Make checks payable to:
Chief, Photoduplication Service, Library of Congress.


5 The Probable Error of a tetrachoric correlation
may be obtained readily by the formula
$PE = \frac{.6745\,\pi}{2\sqrt{N}}$
when each of the dichotomized
distributions has been divided at its midpoint.
However, in the present study the distributions
were unequally divided for the most part. Under
these circumstances, the precise formula for the
PE is cumbersome to the point of being economically
impossible to apply, particularly in view
of the fact that the PE must be computed anew


TABLE 1


THE PERCENTAGES OF SIGNIFICANT INTERITEM
CORRELATIONS WITHIN AND BETWEEN
RATER GROUPS


Rater        Physician   Nurse   Aide   Patient
                (%)       (%)    (%)      (%)
Physician        50        62     46       31
Nurse                      75     52       27
Aide                              56       25
Patient                                    57


Note. r_t = .20 or greater.


and to exclude the group of tuberculosis patients 
in neuropsychiatric hospitals from these subse- 
quent analyses, since it was considered desirable to 
include them in a separate study. 

Within each of the four scales 50-75% of the
correlations are significant as shown in Table 1.⁵
When items are compared among scales (i.e.,
between different categories of raters), the percentage
of significant relationships drops. This
drop is comparatively small when ratings of personnel
are compared, but much larger when patient-personnel
ratings are compared. Percentage of
significant relationships among personnel ratings
is highest between the ratings of nurse and physician
(62%); the percentages of significant relationships
between aide and physician ratings and aide
and nurse ratings are lower (46% and 52%,
respectively). In personnel-patient correlations the
percentages range from 25% to 31%.

Consideration of these results suggests that the
patient's behavior is viewed with some similarity
by personnel who rate him and by the patient
himself. This lends support to the view that the

for each correlation as the dichotomizing points
change.

The dichotomies on which the tetrachoric correlations
in Tables 1, 3, and 4 are based may be
treated as though they were divided at their midpoints,
in order to estimate the conventional
significance points. For the sample of 500 cases
this procedure yields an estimated PE of .047.
Based on this, correlations of .14 and .19 are
required for rejecting the hypothesis of no relationship
at, respectively, the .05 and .01 levels of
confidence; the chosen correlation value of .20 on
which discussion of results is based corresponds
to a p value of .01. But it must be kept clearly in
mind that these estimates of significance level are
systematically inflated over those which would
result from the correct formula for the PE.
For the reduced sample of 350 cases, the estimated
PE is .056; the value of .18 considered
significant in Tables 3 and 4 corresponds to a
p value of .03.
scales assess some consistent behavioral phenomena
in the interpersonal situation of tuberculosis
hospitalization.
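The probable-error estimates quoted in Footnote 5 can be checked with a brief calculation. The sketch below assumes that the midpoint formula in that footnote reads PE = .6745π / (2√N); this reading is an inference from the garbled print, adopted because it yields values close to the reported .047 and .056. The function names are hypothetical, and the normal-curve conversion to a two-tailed p value is a conventional step, not one spelled out in the report.

```python
import math


def probable_error_midpoint(n):
    """PE of a tetrachoric r near zero when both variables are split at the midpoint."""
    return 0.6745 * math.pi / (2.0 * math.sqrt(n))


def two_tailed_p(r, n):
    """Approximate two-tailed p for a tetrachoric r under the midpoint assumption."""
    standard_error = probable_error_midpoint(n) / 0.6745  # PE = .6745 * SE
    z = abs(r) / standard_error
    return math.erfc(z / math.sqrt(2.0))  # area in both tails of the normal curve


pe_500 = probable_error_midpoint(500)  # original sample
pe_350 = probable_error_midpoint(350)  # reduced sample
p_350 = two_tailed_p(0.18, 350)        # value treated as significant in Tables 3 and 4
```

Under this reading the estimate for 500 cases is about .047, the estimate for 350 cases falls between .056 and .057, and a correlation of .18 in the reduced sample gives a p value near .03, in line with the footnote. As the footnote itself cautions, these figures overstate significance wherever the dichotomies depart from the midpoint.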


Development of Indices of Hospital 
Adjustment 


The Hospital Adjustment Index. It was necessary
to reduce the 60 hospital adjustment items to
a single index of "hospital adjustment" (or a
limited number of indices reflecting several aspects
of adjustment) in order to facilitate relating this
area statistically to various other types of information.
Two a priori attempts to derive such an
index were made. One, utilizing the skills of the
clinical psychologist as a rater, was abandoned
when a pilot study showed restricted variations
among scores and hence lack of discrimination.
An item centered approach, however, was more
successful and it was devised in the following
manner: First, each item in the four forms was
examined in the light of the frequency distribution
of its several response alternatives. Where distributions
were such that they did not discriminate
among patients (i.e., did not provide a sufficiently
large range of scores), these items were deleted
from further consideration. This was the case,
for example, in the items on surgery in the
physician's form, the items on unsanctioned discharge
in the patient's form, and the nonobjective
sentence completion items in the patient's form.

Forty-four of the 60 items remained after this
procedure: 4 items from the physician's form, 11
from the nurse's form, 11 from the aide's form,
and 18 from the patient's form. Each of these
was studied carefully for the purpose of making
a clinical judgment as to the meaning of each
response possibility with respect to the quality
of a patient's adjustment. Then, a 2, 1, 0, or -1
value was assigned to each alternative, with a
2 indicating a response which was believed to
reflect the best adjustment to the hospital situation.
For example, it was believed that in Item 1
of the physician's form a check at "b," indicating
that the patient was seen by his physician as
asking for explanations as to his tuberculosis
condition an average amount, might well be interpreted
as better adjustment than a check at
"a" ("asks more than most patients," hence, by
implication, extremely anxious or dependent) or
at "c" ("seldom asks for explanation"). A check
at "b" was hence assigned a value of 2; a check at
"a" was assigned a value of 1; and a check at "c"
was assigned a value of 0. Minus-one values were
assigned in instances where there seemed to be
unequivocal evidence of poor adjustment, as the
physician rating the patient as being in constant
difficulty, the aide rating the patient as getting
into trouble because of the use of alcohol, the
patient rating the personnel as being usually or
almost always unpleasant, and the like.
Thus a scoring key for hospital adjustment as 
reflected in each scale was derived. For each of 


the four rating forms, these values could be 
summed algebraically to yield an index of adjust- 
ment as seen by each of four people. The sum of 
these four scores for the individual patient repre- 
sents the estimate of his over-all hospital adjust- 
ment, called the Hospital Adjustment Index. 
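The scoring procedure can be made concrete with a small sketch. Only the values for Item 1 of the physician's form (a = 1, b = 2, c = 0) come from the text; every other item label, response letter, and value below is a hypothetical placeholder, since the full scoring key is not reproduced in this report.

```python
# Scoring key: form -> item -> response alternative -> assigned value.
# Physician item_1 follows the text; the remaining entries are hypothetical.
SCORING_KEY = {
    "physician": {"item_1": {"a": 1, "b": 2, "c": 0}},
    "nurse": {"item_1": {"a": 2, "b": 1, "c": 0, "d": -1}},
    "aide": {"item_1": {"a": 2, "b": 0, "c": -1}},
    "patient": {"item_1": {"a": 2, "b": 1, "c": -1}},
}


def form_score(form_key, responses):
    """Algebraic sum of the values assigned to the checked alternatives on one form."""
    return sum(form_key[item][choice] for item, choice in responses.items())


def hospital_adjustment_index(scoring_key, ratings):
    """Sum of the four form scores for one patient (the H-A Index)."""
    return sum(form_score(scoring_key[form], ratings[form]) for form in scoring_key)


# Hypothetical ratings of one patient by the four raters.
example_ratings = {
    "physician": {"item_1": "b"},
    "nurse": {"item_1": "b"},
    "aide": {"item_1": "c"},
    "patient": {"item_1": "a"},
}
example_index = hospital_adjustment_index(SCORING_KEY, example_ratings)
```

With the full 44-item key in place of these placeholders, the same two functions would yield each form's score and the over-all index for a patient.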

Table 2 presents the results of the Hospital 
Adjustment Index (hereinafter referred to as the 
H-A Index) when applied to the final sample of 
350 patients. As can be seen in the table, the mean 
H-A Index was 45, and the range was 22 to 61
(lowest possible score was -10 and highest possible
score was 73). The distribution approximates
a normal one.

Factor Scores. In order to broaden the interpretative
base of the hospital adjustment measures
in this study, an additional procedure was undertaken.
This consisted of a factor analysis of 30
items from the Hospital Adjustment Forms,
selected to include as broad a sampling as possible
of the 60 items in those forms. Ten factors were
extracted, but four of these were either composed
of only two items with significant loadings or
included only complex variables which already
were significantly loaded on other factors. Thus,
only the following six factors entered into further
statistical study: I. Cooperative, II. Positive attitude
toward hospital, III. Social activity, IV. Sleep
and appetite problems, VII. Passivity to hospital,
and X. Difficult to accept patient role.

Each factor has a number of the original indi- 
vidual items with significant loadings. Since all of 
the items had been previously dichotomized into 
a plus and minus group to allow for tetrachoric 
correlations, it was easy to derive a factor score 
for each individual: a patient’s score is the num- 
ber of factor-associated items on which he fell in 
the plus group. These factor scores were deter- 


TABLE 2


DISTRIBUTION OF HOSPITAL ADJUSTMENT
INDEX SCORES
(N = 350)


Score             Frequency
Minus scores:
  22-25                7
  26-30                6
  31-35               20
  36-40               44
  41-45               73
Cutoff point
Plus scores:
  46-50              105
  51-55               81
  56-60               13
  61                   1
TABLE 3


RELATIONSHIPS BETWEEN THE H-A INDEX AND SIX
OF THE HOSPITAL ADJUSTMENT FACTORS


Factor                                          H-A Index (r_t)
I. Cooperative                                        .61
II. Positive attitude toward hospital                 .53
III. Social activity                                  .48
IV. Sleep and appetite problems                      -.34
VII. Passivity to hospital                            .46
X. Difficulty in accepting role of patient           -.41


mined for each patient on all factors; the distribu- 
tions were dichotomized to provide a plus or 
minus score for each of the six factors. It was 
then possible to compute tetrachoric correlations 
between these factor scores and the other measures 
in the study. 
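The factor-scoring procedure described above can be sketched in a few lines. This is only an illustration: the item names, the loading pattern, and the patient data are hypothetical, and the median is assumed as the cutting point for the plus/minus split because the text does not state where each factor-score distribution was divided.

```python
from statistics import median


def factor_score(item_flags, factor_items):
    """Count the factor's items on which the patient fell in the plus group."""
    return sum(1 for item in factor_items if item_flags[item])


def dichotomize(scores):
    """Split scores at the median (assumed cut): True = plus, False = minus."""
    cut = median(scores)
    return [s > cut for s in scores]


# Hypothetical data: three patients, four dichotomized items (True = plus group).
patients = [
    {"item_1": True, "item_2": True, "item_3": False, "item_4": True},
    {"item_1": False, "item_2": False, "item_3": False, "item_4": True},
    {"item_1": True, "item_2": False, "item_3": True, "item_4": False},
]
# Hypothetical set of items with significant loadings on one factor.
cooperative_items = ["item_1", "item_2", "item_4"]

scores = [factor_score(p, cooperative_items) for p in patients]
plus_minus = dichotomize(scores)
```

The resulting plus/minus flags are the kind of dichotomy that can then be entered into a fourfold table against any other dichotomized measure.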


Table 3 shows the relationships between the 
factor scores and the H-A Index scores described 
earlier. The correlations are all statistically sig- 
nificant, which suggests that the H-A Index re- 
flects many of the same features of the rated 
responses to the hospital adjustment forms as are 
represented in the factor scores. This finding is
not surprising, in view of the method used in 
developing the H-A Index. As noted above, these 
additional scores were computed in order to utilize 
the fullest amount of hospital adjustment data in 
subsequent relating of hospital adjustment to other 
measures. 


Relationships to Other Data 


Up to this point, the discussion has cen- 
tered about the procedures used and the 
results obtained within the area of hospital 
adjustment. The present section will con- 
sider the relationships obtained between the 
hospital adjustment data and selected demo- 
graphic and psychological test data. 

Table 4 contains the correlations between 
the hospital adjustment data (both index 
and factors) and the demographic and psy- 
chological test data. With respect to the 
method of computing the correlations for 
the six factors, it will be remembered that 
factor scores were used; the resulting dis- 
tributions for each factor were dichoto- 
mized, and then tetrachoric correlations 
were computed between these and other 
data. By dichotomizing the H-A Index dis- 
tribution, the same procedure was followed. 


As has been indicated in the preceding sec-
tion, correlations of .18 or greater were
considered statistically significant. 
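For readers unfamiliar with tetrachoric correlation, the following sketch shows one common hand-calculation shortcut, the cosine-pi approximation, applied to a fourfold table. It is an assumption that an approximation of this kind is comparable to what was used; the text does not say how the tetrachoric coefficients were obtained, and the cell counts below are hypothetical.

```python
import math


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    a and d are the concordant cells (plus/plus and minus/minus);
    b and c are the discordant cells of the fourfold table.
    """
    if b == 0 or c == 0:
        return 1.0   # an empty discordant cell: treated as the upper limit
    if a == 0 or d == 0:
        return -1.0  # an empty concordant cell: treated as the lower limit
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))


# Hypothetical table: 100 patients, both variables split into plus and minus.
r_t = tetrachoric_cos_pi(40, 10, 10, 40)
r_none = tetrachoric_cos_pi(25, 25, 25, 25)
```

With equal counts in all four cells the approximation returns zero, as it should when the two dichotomies are unrelated.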

It is apparent that none of the correla- 
tions shown in Table 4 is sufficiently large 
to account for a high percentage of the 
variance, and thus one is hard put to talk 
about the results obtained here in terms of 
“degree of predictability.” 
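The point about variance can be made concrete by squaring the coefficients. The sketch below uses the significance threshold quoted above (.18) and the largest coefficient legible in Table 4 (-.44); treating a squared tetrachoric coefficient as a proportion of variance is itself a rough convention, so the figures are indicative only.

```python
def variance_accounted_for(r):
    """Proportion of variance associated with a correlation of size r."""
    return r * r


threshold_share = variance_accounted_for(0.18)   # a barely significant value
largest_share = variance_accounted_for(-0.44)    # Age with Factor II
```

Even the largest coefficient corresponds to less than a fifth of the variance, and a coefficient at the threshold to about 3%.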

Nevertheless, there are a number of sig- 
nificant correlations, pointing to relation- 
ships we can examine. These correlations 
cluster themselves about a few nodal points, 
which might be labeled “Youth,” “Positive
attitude toward the hospital,” and “Pas-
sivity.” We shall discuss them in that order.

Youth (defined as being 39 years of age 
or less) correlates significantly with five of 
the six factor scores and with the H-A 
Index. The young group is not coopera- 
tive, has a negative attitude toward the 
hospital, and gets low scores on the H-A 
Index. At the same time, these patients are 
socially active and tend to be without sleep 
and appetite problems. 

Positive attitude toward the hospital at- 
tracts attention because of its number of 
significant relationships with other variables. 
For this reason, too, it may be seen as a 
“bridging” factor, in the attempt to get at 
the interactional aspects of the data. The 
most striking common feature of the vari- 
ables significantly related to this factor is 
the negative psychological connotation. To 
cite some of these characteristics, the pa- 
tients with positive attitudes toward the 
hospital can be characterized by the follow-
ing: low intellectual achievement, low in-
telligence, not “bright” (Cattell Factor B),
and lacking in “independent security.” Even
the positive relationship with the Cattell 
“trustfulness” factor could be interpreted
to suggest a lack of psychological strength, 
in the sense of sheep-like acquiescence. 

Passivity, like Youth and Positive attitude
toward the hospital, has a large number of
significant intercorrelations if one considers
together passivity as defined by the projec-
tive tests and by Hospital Adjustment
Factor VII (an at least tentatively accept-
able procedure). The results suggest that
the passive patient is the cooperative indi-
vidual who finds little difficulty in accepting
the role of a tuberculosis patient. He scores 
high on the H-A Index. He is free from 
anxiety (at least the type found in the 
projective score) and, as might be expected, 
is not endowed with much independent 
security. The contrast between passivity and 
difficulty in accepting the role of patient 
is rather strikingly shown by comparing the
Factor VII (Passivity to hospital) and
Factor X (Difficult to accept patient role)
columns; one is struck by the frequency with
which the sign of one factor is the opposite
of the other in each pair of correlations.

The three areas which have been con- 
sidered represent what are felt to be the 
most important correlational clusters in the 
data of Table 4. They derive their impor-
tance from the possibility that they may
serve as links in relating hospital adjustment
to other areas of interest concerning the
tuberculosis patient.


TABLE 4

RELATIONSHIPS BETWEEN HOSPITAL ADJUSTMENT DATA AND SELECTED DEMOGRAPHIC
AND PSYCHOLOGICAL TEST DATA

                                                  Factors
                              H-A       I         II        III       IV        VII       X
Data                          Index     Coopera-  Positive  Social    Sleep     Passivity Patient
                                        tive      Attitude  Activity  Problems  to        Role
                                                  to                            Hospital  Hard
                                                  Hospital
Demographic:
  Age (39 years or less)      -.18*     -.25*     -.44*      .20*     -.29*     -.08       .19*
  Married                      .04      -.01      -.01       .13      -.04      -.16       .13
  Education (11 grades +)      .07      -.03      -.24*     [illeg.]  -.16      -.14       .07
  Occupation (white collar)   [illeg.]   .27*      .01       .09       .12      -.03       .09
Composite Projective Test
Scores:
  Family important            -.08       .04      -.22*      .18*     -.09      -.15       .13
  Passivity                    .23*      .35*     [illeg.]   .01      -.03       .23*     -.19*
  Anxiety                      .07       .08      -.05       .02      [illeg.]  -.25*      .02
  Independent security        -.15      -.05      -.35*     -.09      -.04      -.26*      .17
Psychological Tests:
  [illeg.] (Intelligence)     -.04       .04      -.35*      .13       .14       .17       .15
  Cattell 16 PF test
    MD no distortion          -.16      -.11      -.20*     -.02      -.08      -.09       .20*
    A  warm, outgoing          .02      -.02       .13       .22*     -.08       .04       .08
    B  bright                 -.10      [illeg.]  -.19*     [illeg.]  [illeg.]  [illeg.]  [illeg.]
    C  mature                  .13      -.01      -.12       .10      -.13       .17       .12
    E  dominant                .02      -.18      -.07       .13      -.01       .04       .08
    F  not depressed          -.03      -.05      -.09       .09       .00      -.10      [illeg.]
    G  conscientious           .04       .16      [illeg.]   .05       .10       .02      [illeg.]
    H  adventurous             .14      -.02       .20*      .06       .03       .08      -.15
    I  not sensitive          [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]
    L  trustful                .14       .04       .28*     -.06       .08       .14       .01
    M  conventional            .18       .14       .02       .02      -.07      [illeg.]  -.12
    N  sophisticated          [illeg.]  -.07      -.02      -.02      [illeg.]   .05       .11
    O  not anxious            -.07      -.19*     -.09       .05       .20*     -.07      -.09
    Q1 experimenting          -.19*      .04      -.17      -.07      -.05       .00       .03
    Q2 self-sufficient         .01       .07      -.03      -.01      -.06      -.14      -.11
    Q3 controlled             [illeg.]   .16      -.05       .18*     -.01       .01      -.11
    Q4 not tense               .14      -.05       .14       .15      -.10       .17      -.12

a Values not computed.
* Significant at .03 level.


Summary 


This section has described the techniques 
employed in assessing hospital adjustment 
of tuberculosis patients. Hospital adjust- 
ment was defined as consisting of sets of 
behaviors determined by many factors both 
within the individual and in his environ- 
ment. Thus personnel as well as patients 
were asked to rate the patient’s behavior on 
multiple-choice rating forms. Five hundred 
male patients, with their associated person-
nel, participated in this rating. The result-
ing response distributions for each of the
60 rated items were
dichotomized, and tetrachoric correlations 
were computed among the items. 

In order to relate hospital adjustment to 
other data in the project, it was necessary 
to develop an index of hospital adjustment. 
Three potential methods for reducing the 
60 items in the hospital adjustment forms 
were discussed. Of these three, the H-A 
Index (the algebraic sum of weighted 
values assigned to the response alternatives) 
was deemed most feasible for subsequent 
statistical computations. This index, as well 
as six factors which resulted from a factor 
analysis of 30 of the 60 hospital adjustment 
items, was then related (via tetrachoric 
correlations) to selected demographic and 
psychological test data for a sample of 350 
patients. This 7 by 26 matrix was inspected 
for clusters of correlations, and three such 
clusters were described. 

On the basis of these clusters, it may be 
concluded that the type of hospital adjust- 
ment which is characterized by contented- 
ness in the hospital, ease in cooperating with 
rules and regulations, and favorably im- 
pressing personnel, is more frequently ob- 
served among older, more passive patients 
than in younger, more independent ones. 
The patient in whom the index and factors 
indicate this pattern of adjustment is with- 
out sleep and appetite problems and is not 
anxious. He may be somewhat less intelli-
gent than patients who display the obverse
pattern, but he gets along well with people
around him. In short, he is a person who
has little difficulty in accepting the role of
a long-term patient, presumably because
that role suits his needs and those of hos-
pital personnel very well.

This formulation is broader than hospital 
behavior, considered alone, since the related 
variables of background and psychological 
test variables have been introduced. How 
these predictor variables relate to physio-
logical aspects of the disease and to post-
hospital adjustment will be discussed, re-
spectively, in the next two sections.



RELATIONSHIPS BETWEEN PSYCHOLOGICAL 
VARIABLES AND RESPONSE TO TREATMENT 
FOR PULMONARY TUBERCULOSIS


As with all medical conditions, some pa-
tients seem to recover faster from pul-
monary tuberculosis than do others who
have the same amount of disease. Does
something in their psychological makeup
account for this differential rate of re-
covery? That is the central question of the
study which is reported in this section.

Few of the previous research studies of
the role of personality variables in tubercu-
losis have attempted to differentiate between
patients who make a good and those who
make a poor response to treatment. While
Brotman's (1955) research was directed at
this question and Ellis and Brown (1950)
maintain that their studies using the Ror-
schach responses substantiate the hypothesis
that mental and emotional factors are re-
lated to response to treatment, most of the
other reports have concerned the role of
emotional factors only as they serve to
describe tuberculosis patients as a group.


Procedures 


A description of the general procedures used
in the selection and testing of subjects for the
research project has been given previously and
will not be repeated at this point. An additional
form was used for this portion of the project.
Information required to complete a Response to
Medical and Surgical Treatment Form included
the results of the bacteriological tests of the
patient's sputum or gastric content for evidence
of the tubercle bacillus. In addition, information
was recorded in regard to the results of the 
periodic X-ray films as they indicated the progress 
the patient made toward stability of the disease 
process and closure of cavities where these had 
existed as part of the disease. This form was 
completed by the investigators in consultation 
with the patient's ward physician. 

Complete medical data were obtained on 78 
patients. While this is a comparatively small
group, it was felt that the importance of this 
aspect of the research justified statistical analysis 
of the data. It was our hope that hypotheses 
might be derived which would provide a basis 
for more extensive research studies. 

Three measures of response to treatment were 
selected and defined in terms of the rate of 
achieving certain goals. The rate of bacterio- 
logical conversion was defined as the number 
of months from the date of initiation of chemo- 
therapy to the date of the first of a series of 
uninterrupted concentrate or culture reports which 
failed to reveal the presence of tubercle bacilli. 
X-ray stability rate was defined as the number 
of months from the date of initial chemotherapy 
to the date of the first X-ray film which the 
ward physician considered to show no significant 
change in the disease process for a period of 
3 months. Cavity closure rate was defined as the 
number of months from the initiation of chemo- 
therapy to the date of the first X-ray film judged 
by the ward physician to show closure of all 
cavities which may have existed previously. 

It seemed highly desirable that these three 
measures be combined into a single index of re- 
sponse to treatment and several small studies were 
conducted to determine the best method to use 
to accomplish this. A determination of the inter- 
relationships between the three variables, using the 
data available on the subjects in this phase of the 
project, revealed the fact that X-ray stability and
cavity closure are highly correlated (r = .92) and
bacteriological remission and X-ray stability are
moderately correlated (r = .66). It was decided,
therefore, that an index consisting of X-ray
stability and bacteriological remission would ade-
quately represent all three variables.

6 The restricted number of subjects available for
this phase of the studies probably resulted from
the fact that, since participation in each phase of
the project was voluntary, only a limited number of
investigators chose to collect data on response to
treatment, which required a 12- to 18-month wait
until all data were available and the completion
of a form requiring the detailing of medical
information. It is interesting to note that, using
demographic data, comparison of the patients avail-
able for this study with the total sample revealed
no significant differences between the two groups.

In another survey, a questionnaire was devised
to sample the attitudes of physicians who were
actively engaged in the treatment of pulmonary
tuberculosis. They were requested to indicate their 
opinions as to the relative importance of bacterio- 
logical remission and X-ray stability as indicators 
of response to treatment. Fifteen physicians, repre-
senting five Veterans Administration hospitals,⁷
participated in filling out this questionnaire. Sixty- 
seven percent indicated that bacteriological remis- 
sion is the more important measure, 22% felt 
X-ray stability to be more important, and 11% 
considered them equal in importance. The mean 
rating for bacteriological remission (out of a total 
possible rating of 10) was 6.0 and for X-ray 
stability was 4.0. This ratio of 1.5 to 1.0 was, 
therefore, adopted and the Index of Response to 
Treatment used in this study is made up of 1.5 
times the number of months it takes to achieve 
bacteriological remission plus the months to attain 
X-ray stability. 
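The index defined in this paragraph is simple enough to state as a short function. The sketch below follows the definition in the text (1.5 times the months to bacteriological remission plus the months to X-ray stability); the function and argument names are illustrative, and the patient values are hypothetical.

```python
# Weight taken from the physicians' mean ratings reported in the text.
REMISSION_WEIGHT = 6.0 / 4.0  # mean rating 6.0 versus 4.0, a ratio of 1.5 to 1.0


def response_to_treatment_index(months_to_remission, months_to_xray_stability,
                                weight=REMISSION_WEIGHT):
    """Weighted sum of the two rates; a larger value means a slower response."""
    return weight * months_to_remission + months_to_xray_stability


# Hypothetical patient: remission after 6 months, X-ray stability after 10.
index = response_to_treatment_index(6, 10)
```

Because both components are counted in months, a high index score reflects a slow, and therefore poor, response to treatment.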

A separate survey was also conducted to deter- 
mine the reliability of judgments of the variable 
of X-ray stability by physicians who were experi- 
enced in the evaluation of X-rays as a measure 
of response to treatment. Twelve physicians at 
two Veterans Administration hospitals⁸ were
asked to indicate which of four X-rays on the 
same patient represented the one indicating sta- 
bility. An over-all agreement in rating of 72% 
was obtained. No study of bacteriological remis- 
sion data reliability was made but it is probable 
that it is of satisfactory reliability as it is derived 
from the use of standard laboratory procedures. 
It seems justified, therefore, to accept the index 
as having adequate reliability for this type of 
rating procedure. 

It was next determined that there was a
statistically significant difference in mean index
values between the groups of patients diagnosed as
moderately advanced and far advanced.⁹ As a
result these two groups of patients have been
considered individually in the analysis of the
results.


Results 


The general plan for analysis of the
results was to compute medians for the
Index of Response to Treatment scores and
for the demographic and psychological vari-
ables and to compare patients scoring above
and below the medians by the use of four-
fold tables. This procedure appeared to be
adequately sophisticated for an exploratory
study. Chi square has been used for the
analysis of data and phi coefficients have
been computed to provide an indication of
the direction and degree of correlation be-
tween the variables. Evidence of statistical
significance has been obtained by submitting
the chi square values to the appropriate test
with one degree of freedom.


7 Livermore, California; Long Beach, Cali-
fornia; San Fernando, California; Downey,
Illinois; and Baltimore, Maryland.

8 Long Beach, California, and San Fernando,
California.

9 The 1955 diagnostic standards of the National
Tuberculosis Association were used in the deter-
mination of diagnostic categories.


TABLE 5

CHI SQUARES AND PHI COEFFICIENTS FOR RELATIONSHIPS BETWEEN INDEX OF RESPONSE
TO TREATMENT AND DEMOGRAPHIC VARIABLES

                                          Moderately Advanced     Far Advanced
Variable(a)                                    (N = 48)             (N = 30)
                                            χ²        φ           χ²        φ
Under 40 years of age                       .36       .09        [illeg.]   .12
Presently married                          3.14       .26        [illeg.]  -.13
Above 11 years of education                1.42       .18         .00       .00
White collar or better occupation           .42      [illeg.]    [illeg.]  -.15
None or low government compensation         .24       .07         .28      -.10
No previous admissions for tuberculosis     .49       .10         (b)       (b)
No other medical diagnoses                 1.40      [illeg.]     .29      -.10

a Description is characteristic of above median group.
b Value not computed inasmuch as all patients but one were first admission.
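The fourfold-table analysis described above can be illustrated with a short sketch. It uses the standard formulas for chi square and phi on a 2 x 2 table and applies a continuity correction when the smallest expected frequency is below 5, as the accompanying footnote on continuity describes. The function name and cell counts are hypothetical, and 3.84 is the usual 5% critical value for one degree of freedom.

```python
import math

CRITICAL_05 = 3.84  # chi square needed at the 5% level, one degree of freedom


def fourfold_stats(a, b, c, d, correct_small=True):
    """Return (chi_square, phi) for the fourfold table [[a, b], [c, d]]."""
    n = a + b + c + d
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    margins = r1 * r2 * c1 * c2
    if margins == 0:
        raise ValueError("every row and column needs at least one case")
    cross = a * d - b * c
    phi = cross / math.sqrt(margins)  # the sign gives the direction
    diff = abs(cross)
    smallest_expected = min(r1, r2) * min(c1, c2) / n
    if correct_small and smallest_expected < 5:
        diff = max(0.0, diff - n / 2)  # correction for continuity
    chi_square = n * diff ** 2 / margins
    return chi_square, phi


# Hypothetical median split of 30 patients on the index and on one test score.
chi_square, phi = fourfold_stats(12, 3, 3, 12)
significant = chi_square > CRITICAL_05
```

Without the correction the two statistics are tied together by chi square = N times phi squared, which is why a table can report both from the same four cell counts.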

It is well to keep in mind the fact that 
in a study employing a number of variables 
there is a possibility that a certain number
of significant findings may be obtained on 
a chance basis. Of the 28 psychological 
variables which were studied, 2 might prove 
significant at the 5% level by chance alone. 
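The chance expectation mentioned here is easy to check. The sketch below assumes, for simplicity, that the 28 tests are independent, which the text does not claim; under that assumption the expected number of spuriously significant results is 28 times .05, or about 1.4, which is of the same order as the figure of 2 given above.

```python
N_VARIABLES = 28
ALPHA = 0.05

# Expected number of "significant" results if no real relationships exist.
expected_by_chance = N_VARIABLES * ALPHA

# Probability that at least one of the tests is significant by chance alone
# (assumes the tests are independent of one another).
p_at_least_one = 1 - (1 - ALPHA) ** N_VARIABLES
```

Under these assumptions, at least one chance finding is more likely than not, so a single significant variable would carry little weight by itself.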

Table 5 presents the comparison of the 
demographic variables with the Index of 
Response to Treatment for the moderately 
advanced group of patients and for the far 
advanced patients. Inspection of the table 
reveals that none of the demographic vari- 
ables is significantly related to response to 
treatment in either of the diagnostic groups. 


10 Correction for continuity was applied when
the smallest expected cell frequency was less than 
5. The use of a nonparametric statistic was indi- 
cated by the fact that frequency distributions of 
the index scores for both groups appear to be 
skewed in the direction of high scores or poor 
response to treatment. 


Results of the test of intelligence level 
and the 16 Personality Factor Test are pre- 
sented in Table 6. The correlations with the 
response to treatment index are uniformly 
low for the group with moderately advanced 
disease and we must conclude that no 
evidence of relationships between these vari- 
ables has been found. 

For the far advanced group, however, 
there is evidence of significant relationships 
between several of the personality variables 
and response to treatment. Factor O, not 
anxious, is significant at the 1% level; 
Factor Q1, experimenting, at the 5% level;
and Factor MD and the combined anxiety 
factors are significant at close to the 5% 
level. The Cattell factor score results will 
be discussed in terms of the descriptive 
statements given by the test authors. The 
reader should bear in mind that the inter- 
pretation of meaning to factors is a highly 
subjective task. This is especially true when 
each factor is composed of but a small num- 
ber of items as is the case in the 16 PF test. 


The finding that is of the highest degree 
of statistical significance (φ = .58, estimated
r = .91) is the relationship between high
scores on Factor O of the 16 PF test and
good response to treatment. This factor is
said by the test authors to measure personal
confidence and freedom from anxiety. Per-
sons scoring at this end of the dimension
are described in the test manual as tending
to be “placid, calm, with unshakable nerve.
He has a mature, unanxious confidence in 
himself and his capacity to deal with things" 
(Cattell, 1956). Conversely, patients with 
far advanced disease who have poor re- 
sponse to treatment tend to have scores on 
this factor which indicate that they "tend 
to be depressed, moody, suspicious, brood- 
ing, avoiding people. They have a childlike 
tendency to anxiety in difficulties" (Cattell, 
1956). 

TABLE 6

CHI SQUARES AND PHI COEFFICIENTS FOR RELATION-
SHIPS BETWEEN INDEX OF RESPONSE TO TREAT-
MENT AND OBJECTIVE PSYCHOLOGICAL
TEST SCORES

                               Moderately Advanced     Far Advanced
Test                                (N = 48)             (N = 30)
                                 χ²        φ           χ²        φ
PTI Verbal                       .00       .00        [illeg.]   .10
Cattell Factors:
  MD no distortion               .36       .09        3.80       .34
  A  warm, outgoing             2.71       .24         .59      -.15
  B  bright                      .35       .09        [illeg.]   .16
  C  mature                      .38       .09        1.70       .24
  E  dominant                   [illeg.]   .06         .59      -.15
  F  not depressed               .36       .09         .59      -.15
  G  conscientious              1.87       .20         .30       .10
  H  adventurous                1.40       .17        1.47      -.23
  I  not sensitive               .00       .00         .62      [illeg.]
  L  trustful                    .35       .09        1.47       .23
  M  conventional               1.48       .18         .59       .15
  N  sophisticated               .64       .12         .00       .00
  O  not anxious                [illeg.]   .06        9.96**     .58**
  Q1 experimenting               .00       .00        4.83*      .38*
  Q2 self-sufficient             .38       .09         .56       .14
  Q3 controlled                 1.44       .18         .36      -.11
  Q4 not tense                   .36       .09         .00       .00
Low combined anxiety             .88      -.14        3.74       .37
Low combined neuroticism         .02       .02         .14       .07


The second factor on which a relationship
to response to treatment is found (φ = .38,
estimated r = .59) is Q1, experimenting
versus conservative. Good response to treat- 
ment appears to be associated with the 
person who "tends to be interested in intel- 
lectual matters and fundamental issues. He 
frequently takes issue with ideas, either old 
or new. He tends to be more well informed, 
less inclined to moralize, and more tolerant 
of inconvenience" (Cattell, 1956). Poor re- 
sponse to treatment is found in patients in 
this study who give scores indicating that 
they are cautious and opposed to any 
change. 

Two other scores on the 16 PF test may 
be worthy of attention. Both had chi square 
values which were close to the 3.84 which 
is required for significance at the 5% level. 
The MD scale of the Cattell test yielded a 
chi square of 3.80. This scale was designed 
as a measure of reliability of answers and 
may be considered as a rough measure of 
the amount of distortion present in the 
answers. The results suggest that patients 
with far advanced disease who respond to 
treatment slowly tend to have less reliable 
Cattell scores. 

In addition to the usual scoring of the 
16 PF test, two scores representing a com-
bination of factor scores were computed.¹¹
One of these—neuroticism—did not signifi- 
cantly relate to response to treatment in 
either group. The other—anxiety—yielded 
a chi square of 3.74 and a phi coefficient 
of .37 with the index for the far advanced 
group. This measure is made up of a com-
bination of scores on Factors L, O, Q3, and
Q4. In several research studies these fac-
tors have been found to constitute among
the best markers of the anxiety factor, des-
ignated U.I. 24 in the Universal Index System
proposed by Cattell (1957). One of the 
factors—O—has been previously discussed 
as being significantly related to response to 
treatment. 

Results of the comparison of the projec- 
tive test variables with response to treat- 
ment are given in Table 7. None of the 
findings is of statistical significance for 
either diagnostic group. 


* Significant at .05 level.
** Significant at .01 level.


11 I. H. Scheier, personal communication, 1959.
TABLE 7

CHI SQUARES AND PHI COEFFICIENTS FOR RELATION-
SHIP OF INDEX OF RESPONSE TO TREATMENT
TO THE COMPOSITE PROJECTIVE
TEST SCORES

                               Moderately Advanced     Far Advanced
Composite Projective                (N = 48)             (N = 30)
Scores                           χ²        φ           χ²        φ
Family important                 .00       .00        1.47      -.23
Evasion                         1.87       .20        1.70       .24
Passivity                        .36       .09         .30      -.10
Anxiety                          .40       .09         .00       .00
Moral judgment                   .00       .00        1.70       .24
Control                         3.33       .27         .28      -.10
Independent security             .00       .00         .30       .10


Perhaps of greater practical importance 
than the foregoing analysis of the results 
is the development of a regression equation 
for use in the prediction of response to 
treatment from known values for the psy- 
chological test variables. Application of 
such an equation to a new group of patients 
would permit a test of the validity of the 
findings and an evaluation of the potential 
usefulness of the findings in the hospital 
treatment program. 

The lack of significant results for the 
patient with moderate disease argues against 
any attempt to use these findings for predic- 
tive purposes. Development of a regression 
equation for the far advanced group would, 
on the other hand, appear to be fruitful. 


The three Cattell scores, MD, O, and Q1, 
which have a statistically significant corre- 
lation with the Index of Response to Treat- 
ment were selected for the multiple regres- 
sion equation. Means, standard deviations, 
and intercorrelations for these factors are 
shown in Table 8.

A multiple phi coefficient of correlation 
between these three factors and the Index 
of Response to Treatment is .70. This value 
may be compared with the highest single 
correlation of .58 between the index and 
Factor O. The use of the multiple factors 
in the regression equation appears to be of 
value, therefore, as it results in a 15% 
increase in the variance that can be ac- 
counted for in the prediction of the index. 
The regression equation resulting from the 
use of the data in Table 8 is as follows:
X_RT = 1.53X_1 + 2.79X_2 + 2.04X_3 - 17.56.
X_RT is the response to treatment index score,
X_1 the stanine standard score on Factor
MD, X_2 the stanine standard score on
Factor O, and X_3 the stanine standard score
on Factor Q1.
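As a worked check, the regression equation can be applied directly. The sketch below uses the coefficients printed in the equation and evaluates it at the factor means reported in Table 8 (5.0, 6.8, and 5.2); the function name is illustrative. It also recomputes the gain in variance accounted for from the two coefficients quoted in the text (.70 and .58).

```python
def predicted_response_index(md, o, q1):
    """Predicted index from stanine scores on Factors MD, O, and Q1."""
    return 1.53 * md + 2.79 * o + 2.04 * q1 - 17.56


# At the Table 8 factor means the prediction should land near the mean
# index score of 19.7 reported in the same table.
at_means = predicted_response_index(5.0, 6.8, 5.2)

# Gain in variance accounted for: multiple coefficient versus Factor O alone.
variance_gain = 0.70 ** 2 - 0.58 ** 2
```

The prediction at the means comes to about 19.67, in agreement with the tabled mean of 19.7, and the variance gain comes to about .15, in agreement with the 15% stated in the text.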


Discussion 


This investigation of relationships be- 
tween psychological variables and response 
to treatment in pulmonary tuberculosis has 
been essentially exploratory in nature. No 
specific hypotheses were formulated prior 
to initiation of the study. The general plan 
was to present tests which would represent 
a wide sampling of psychological variables. 
The results might then provide a source for 
hypotheses which could be tested in subse- 


TABLE 8 


Means, STANDARD DEVIATIONS, AND INTERCORRELATION VALUES FOR THREE CATTELL FACTOR SCORES 
AND THE RESPONSE TO TREATMENT INDEX FOR THE GROUP WITH FAR ADVANCED DISEASE 


Intercorrelation (¢) 


Variable Mean SD 
Factor MD Factor O Factor Qi - 
Response to Treatment Index score 19.7 10.7 .34 .58 .38 
Factor MD 5.0 1.9 14 .00 
Factor O 6.8 1.9 AS 
Factor Qi 5.2 1.6 
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quent research. An additional goal was the 
formulation of a method for predicting 
response to treatment such as has been 
described in the preceding section. 

The results have indicated that for the 
patients with moderately advanced pul- 
monary tuberculosis none of the demo- 
graphic or psychological variables which 
were included in the study has a significant 
relationship to response to treatment. Re- 
sults obtained for the group of patients 
with far advanced disease, on the other 
hand, have indicated that certain of the 
psychological variables do appear to be sig- 
nificantly related to response to treatment. 
More evidences of such relationships were 
obtained than would be expected by chance. 

The findings may be interpreted as sug- 
gesting the hypothesis that patients with 
far advanced disease who are relatively free 
from anxiety reactions during their hos- 
pitalization will be likely to have a good 
response to treatment. The personality test 
results indicate that this freedom from 
anxiety may result from a secure personal- 
ity structure, from an effective use of 
defenses such as withdrawal, or from such 
traits as passivity and submissiveness. 

The principal basis for such a hypothesis
comes from the significant correlation ob-
tained for Factor O of the Cattell test. The
score on this scale is a principal marker for
the general anxiety factor and appears to 
indicate a state of personal security and 
freedom from proneness to guilt reactions. 
The patient scoring high on this factor 
should be able to tolerate the stresses which 
result from hospitalization for tuberculosis 
without excessive anxiety. The other two 
factors which are most highly related to 
response to treatment may also be inter- 
preted as being consistent with the hypoth- 
esis. The score on Factor MD apparently 
indicates that patients who do well have 
relatively less need to distort their answers 
in a defensive or unreliable manner. Factor
Q1 appears to reflect a flexibility of person-
ality which would permit the patient to
adapt himself to the hospital situation with
a minimum of anxiety.

The disparity between the results obtained
for the patients with moderately advanced
disease and those that have been found for
the group with far advanced disease pre- 
sents a difficult problem in interpretation. 
Why should relationships be found between 
psychological variables and response to 
treatment in one group and no evidence of 
similar relationships be seen in another 
group that differs only in severity of dis- 
ease at the outset of treatment? One pos- 
sible explanation might hold that there are 
two processes involved in response to treat- 
ment and that these relate in an antagonistic 
manner. The first is the physiological ability 
of the host organism to use chemotherapy 
to inhibit the disease process and promote 
the healing recovery from the disease. A 
second process would be the psychological 
reactions which may deter recovery. With 
lesser states of disease the potentiality of 
the recovery processes might be of sufficient 
vigor to outweigh the deterrent effects of
psychological factors. Only when the dis- 
ease is severe and therapy is relatively 
impotent would psychological variables ex- 
ert an important influence. 

If this explanation is a valid one, the 
results of the present study would be ex- 
pected to show tendencies to relationships 
in the group with moderately advanced 
disease for the same variables that are sig- 
nificantly related in the group with far 
advanced disease. This would follow from 
the fact that the two diagnostic groups are 
not discrete but represent a continuum of 
severity which has been artificially forced 
into two groups. However, no such pattern 
of results has been noted for the two groups. 
Where variables are significant for the 
group with far advanced disease there is 
little evidence of correlation for the group 
with moderately advanced disease and there 
are many instances in which the two groups 
correlate in opposite directions. 

A tenable explanation of the differences
between the groups must apparently evolve 
from the discovery of differences in com- 
position of the groups other than the dimen- 
sion of severity. One obvious disparity is 
that the two groups differ in the number 
of patients included who have cavitary dis-
ease. All of the patients with far advanced
disease have cavities and in most cases they
are bilateral. Fifty percent of the patients
of the group with moderately advanced 
disease, however, have no cavitation and 
there are no instances of bilateral cavitation. 
If cavitation is indeed the essential differ- 
ence between the groups, it might be con- 
sidered that the results suggest that the 
primary effect of psychological factors on 
tuberculous disease is exerted on the process 
by which cavity tissue damage is resolved. 

A more elaborate research project to test 
the hypothesized relationship between anx- 
iety and response to treatment is currently 
being pursued.¹² The design of this investigation
provides for psychological tests
which are more sensitive to the anxiety 
factor. In addition, measures of autonomic 
nervous system and adrenal-pituitary func- 
tions are being obtained. These measures 
will be repeated at several stages in the 
treatment program and will be related to 
response to treatment using data similar to 
that used in the present study. 

The development of a multiple regression 
equation for the prediction of response to 
treatment for patients with far advanced 
disease from three of the Cattell scores con- 
stitutes a result of the present study that is 
primarily practical rather than theoretical. 
If the validity of the predictions is sup- 
ported by further investigation, they should 
allow for the use of therapeutic techniques 
to counter elements in the patient’s psychological
patterns which may militate against
good response to treatment and allow for 
the achievement of optimum results in the 
treatment program. It is important to note, 
however, that the use of the prediction in 
its present state is subject to all of the 
cautions that must accompany data which 
have not been put to the crucial test of 
cross-validation. An essential followup of 
the present investigation is the use of the 
regression equation with a new group of 
patients and the computation of the statisti- 
cal relationship between the predicted re- 
sponse to treatment index scores and those 
obtained from the actual clinical records. 
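
The cross-validation step described above can be sketched in Python. This is a minimal illustration, not the study's computation: the regression weights, the intercept, and the patient scores are all hypothetical placeholders, and the function names are assumptions introduced here. The equation is held fixed, applied to a new sample, and the predicted indices are correlated with the observed ones.

```python
from math import sqrt


def predict_index(weights, intercept, scores):
    """Apply a fixed regression equation to one patient's predictor scores."""
    return intercept + sum(w * s for w, s in zip(weights, scores))


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical weights for three predictor scores, frozen from the
# derivation sample; they are NOT the weights obtained in the study.
WEIGHTS = [0.8, -0.5, 0.3]
INTERCEPT = 2.0

# Hypothetical new sample: (three predictor scores, observed response index).
new_patients = [
    ([6, 4, 5], 6.5),
    ([3, 7, 2], 1.0),
    ([8, 2, 6], 9.0),
    ([5, 5, 5], 5.5),
    ([4, 6, 3], 2.5),
]

predicted = [predict_index(WEIGHTS, INTERCEPT, s) for s, _ in new_patients]
observed = [obs for _, obs in new_patients]

# The cross-validity coefficient: agreement between prediction and record.
cross_validity = pearson_r(predicted, observed)
print(round(cross_validity, 2))
```

A coefficient in the new sample that is much lower than the multiple correlation in the original sample would indicate that the equation had capitalized on chance.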


12 Designated as Project 6 of the Cooperative 
Psychological Research Program. 


ADJUSTMENT ON RETURN 
TO THE COMMUNITY 


Earlier sections have shown how tuber- 
culosis patients face a considerable challenge 
to their ability to cope with long-term hos- 
pitalization. On their return home they 
again must adjust; this time to appropriate 
family and community living. In spite of 
their expectations, the home situation will 
not have remained static. When one is pres- 
ent the environment frequently changes in 
small ways, but people meet these variations 
almost automatically. After the patient has been away
for many months, however, these small changes
may accumulate to give a noticeably differ- 
ent aspect to the situation. 

In addition to this source of change, 
others may occur because of the patient’s 
absence. For example, circumstances may 
have required his wife to take over the 
role as head of the family in his absence. 
She may have had to accept some type of 
relief, or perhaps have gone to work to 
support the family. If this necessitated a 
baby sitter, that person may have earned a 
spot in the affections of the family during 
the patient’s absence. His wife’s increased 
self-reliance and sense of responsibility, and 
the money she has been earning, may be 
part of the changed reality he will face on 
returning home. 

Some of his readjustment problems in 
the community also may be compounded by 
the role the patient assumed in the hospital. 
The attitude of dependency, often found 
with long-term hospitalization, may prove 
difficult to shed as he resumes his role as 
head of the household. Many workers in 
this field can cite instances of another atti- 
tude which can be carried over from hos- 
pital days: a patient is worried about relapse 
and thus is afraid to start working again 
despite reassurances from his physician. 
Sometimes, another residual of his hospitalization
may carry over in the form of an
overreaction to the restriction on his independence
which he felt as a patient. Such
an individual may have difficulty in accept- 
ing normal authority or the usual limitations 
at home or on the job. 


This phase of the project was designed
to evaluate readjustment to the community, 
and to study the relationship of demo- 
graphic information, and of psychological 
test variables to this criterion. A first goal 
was to develop an index of community ad- 
justment which might be used in a number 
of comparisons. 

Adjustment to the community is here 
defined as a global response pattern which 
will enable an expatient to get along most 
smoothly and happily within the culture. A 
logical analysis would suggest that a good 
adjustment would find the discharged pa- 
tient playing an active and productive role 
in the community. He would be working if 
at all possible. He would be friendly and 
sociable rather than hostile or antisocial. 
Beyond assuming responsibility for himself 
and his family, he would work on whatever 
activities he could that might better his 
neighborhood and community. These be- 
haviors are some of the major dimensions 
one might expect from a community adjust- 
ment measure, if middle class views of what 
constitutes a successful adjustment are fol- 
lowed (Warner, Meeker, & Eells, 1949). 
Widely held values of most former patients 
and hospital staff would include working, 
being sociable, helping others, and feeling 
well, as part of a good community re- 
adjustment. 

Obviously the minority who did not hold 
these views prior to hospitalization would 
not be expected to change because of the 
time spent in the hospital per se. The 
investigation of relationships between prehospital
and posthospital levels of adjustment
will be part of a future study. The
present project is concerned with the relationships
between posthospital adjustment
and psychological variables. 


Procedures 


The previously described projects of this cooper- 
ative study were conducted with hospitalized 
patients who, although volunteers, were more 
available for study than those individuals who 
already had been discharged to their community. 
Also adjustment to hospitalization could be judged 
by a number of independent observers. Response 
to treatment while in the hospital could be judged
by objective means such as laboratory tests or
observed changes on X-rays. To obtain equally
rigorous evidence on their community readjustment
would involve gathering information independently
from family members, neighbors, friends,
and employers, a procedure which was economically 
unfeasible. Instead, self-reports were obtained 
through fairly intensive and structured interviews 
with psychologists. These data, covering most 
aspects of the subject’s activities following his 
hospital discharge, were recorded by the psycholo- 
gist on a form during the course of about an 
hour's interview. 

Each interview was obtained at the time of the 
patient's visit to the hospital for a medical follow- 
up at least 6 months after hospital discharge. 
The median time lapse was 15 months after leaving 
the hospital. Just preceding the interview, these 
subjects also took the same battery of psychological 
tests used in the other two protocols. Data were 
thus collected on 185 male subjects from three 
follow-up clinics of cooperating hospitals. 


Results 


As was true for the other two studies, the 
data were processed in the central labora- 
tory. As a first step in processing, the vari- 
ous responses were divided rationally into 
two fairly even divisions for each of the 
interview questions. These dichotomies were 
used in computing tetrachoric correlations 
of every item with all the others. Fifty- 
seven of the 64 questions in the interview 
lent themselves to this treatment. From this 
table of intercorrelations two matrices were 
selected to use in factor analysis. Some 
groupings of high intercorrelations related 
to working, while others concerned not 
working. Since these groupings could not 
logically be correlated with each other, two 
matrices were needed. From these factor 
analyses emerged four more-or-less independent
factors. They were labeled “A”
and “B” from the one analysis, and “1,”
“2,” and “4” from the other (B and 1 were
essentially the same factor found in both
matrices).¹³


13 The matrix of items associated with working 
was factored by James P. O'Conner, Veterans 
Administration Consultant from the Catholic Uni- 
versity of America. A complete description of the 
two factor analyses is included in a separate 
publication now in process.

Table 9 shows the interview items with
significant loadings on each of the four 
factors. Factor scores were computed using 
the same procedure described in the section 
on hospital adjustment. Similarly, the dis- 
tribution of scores for each factor was 
dichotomized to permit the computation of 
tetrachoric correlations between factor 
scores and the personal history and psycho- 
logical variables. 
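
The dichotomize-then-correlate step described above can be sketched in Python. The monograph does not say how its tetrachoric coefficients were computed, so this sketch substitutes the cosine-pi approximation, which is reasonable only when both variables are split near the median. The function names and the cell counts are illustrative assumptions, not values from the study.

```python
from math import cos, pi, sqrt


def dichotomize(scores, cut):
    """Score 1 for values above the cut point, 0 otherwise."""
    return [1 if s > cut else 0 for s in scores]


def fourfold_table(xs, ys):
    """Count the cells of the 2 x 2 table for two dichotomized variables."""
    a = sum(1 for x, y in zip(xs, ys) if x == 1 and y == 1)  # high, high
    b = sum(1 for x, y in zip(xs, ys) if x == 1 and y == 0)  # high, low
    c = sum(1 for x, y in zip(xs, ys) if x == 0 and y == 1)  # low, high
    d = sum(1 for x, y in zip(xs, ys) if x == 0 and y == 0)  # low, low
    return a, b, c, d


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation: r = cos(pi / (1 + sqrt(ad / bc)))."""
    ad, bc = a * d, b * c
    if ad == 0 and bc == 0:
        raise ValueError("correlation is undefined for this table")
    if bc == 0:
        return 1.0  # limiting value when no discordant pairs exist
    return cos(pi / (1 + sqrt(ad / bc)))


# Hypothetical fourfold table for a factor score against a test score.
r_t = tetrachoric_cos_pi(40, 20, 20, 40)
print(round(r_t, 2))
```

An exact tetrachoric coefficient requires fitting a bivariate normal surface to the table; the approximation is used here only to keep the sketch self-contained.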

A study of the items with high loadings 
on each factor provides a subjective basis 
for the tentative identification of the behavior
pattern represented. Factor A would
appear to relate to Job Security or Stability.
This factor, of course, came from the 
matrix dealing with job variables for those 
who were working. The second factor found 
in this group of items, Factor B, appears to 
be a reported feeling of good health. This 
same set of items also came up as Factor 1 
in the analysis of the other matrix, since 
the two overlapped except for the working 
or not working items. 

Factor 2 in the “not-working” matrix has
been labeled tentatively Not Doing or Inertia.
Subjects high in this trait tend to
miss medical rechecks, have no hobbies,
drink less since hospital discharge, and
be “not working, but want a job.” These
items seem to describe a type of former
patient seen by many rehabilitation workers.
Such a client might be from a lower socioeconomic
class (to account for no hobbies),
and be so impressed by the seriousness
of his medical condition, and so fearful of
a relapse, that he will sit at home doing
nothing for much longer than necessary.
This factor might also include the passive,
poorly motivated person.


TABLE 9
FACTORS FOUND IN DISCHARGED PATIENT INTERVIEWS
(N = 185)

                                                              Factor
Factor  Variable                                              Loading (rt)

A       Job Security—Stability
          Feels present job is secure                           .90
          Does not want to change jobs                          .61
          Present job chosen because of a positive attitude
            toward a specific, not personal aspect of the work  .59
          No difficulty in getting a job                        .51
          Present total income from own salary only             .47a
          Upward trend in jobs since discharge                  .47
          Most leisure time spent in reading and music          [illegible]

B-1     Medical—Health
          Full work tolerance—7 plus months after discharge     .94
          Feels present health is very good                     .72
          No medical restrictions at discharge on type of work  .66
          Currently lives with family                           .62
          No physical complaints at present                     .60
          No daytime rest recommended by physician at discharge .58
          Present total income from own salary only             .51a
          No difficulty in getting a job                        .41a
          Total number of months on present job                 .39a

2       Not Doing—Inertia
          Has missed one or more medical rechecks              1.00
          Has no hobbies                                        .70
          Does drink less since discharge                       .52
          Not working but does want a job                       .47a

4       No Change Since Illness
          No changes in organization membership                 [illegible]
          No changes in sports participation                    .59
          Does not live with family                             .58
          Drinks only beer and/or wine                          .49
          Does not drink less                                   .37
          Does not smoke less                                   .35

a Variable omitted in subsequent computation of factor scores.

The remaining Factor 4, labeled No 
Change Since Illness, seems less defined 
than the other three. Four of the six items 
used to score it involve no change. Two 
diverse impressions could be formed from 
a scrutiny of the items with significant load- 
ings on this factor. The constellation might 
reflect little concern over the implications 
of the illness and a refusal to allow it to 
force any changes in previous living patterns.
On the other hand, the factor might
reflect the group who do not live with 
families, participate in sports, or belong to 
organizations; thus, there would be no 
change in these behaviors. As a part of this 
pattern, drinking and smoking are consid- 
ered inviolate rights, and not subject to 
change. This type of person is often found 
on the “skid row” of our cities. 

A few other factors were found in these
analyses, but they were not well defined and
included primarily complex variables with
significant loadings on other factors.

Table 10 shows the interrelationships 
among the four major factors. Each relates 
to the one following it on the listing, and
these correlations seem to make good sense
psychologically. The .40 between Job Sta- 
bility and Health factors could suggest that 
good health keeps one on the job, as well 
as the converse, that feeling happy and secure 
in one's job relaxes tensions and promotes 
health. 

The Health and Inertia factors correlate 
—.31. The items here suggest that those 
who have missed medical rechecks are perhaps
showing a fear of what the physician
might find as well as the inertia implied by 
the factor label. Their "drinking less since 
discharge from the hospital" and "not 
working yet wanting a job," also suggest 
people who lack confidence in their health 
status, which is more or less opposite from 
those high on Factor B-1, who exude con- 
fidence in their health. 

The last two factors, representing inertia 
and no change correlate —.51. The less the 
inertia the more there is no change since 
illness. If we accept the prior reasoning that 
the inertia shown is due to a fear of the 
consequences of activity on his health, and 
that the No Change factor shows a proper 
de-emphasis of the risk of a relapse, one 
would expect them to work against each 
other to a fairly strong degree. Some of 
these relationships may also be explained 
because of the one item common to both. 

Table 11 gives the tetrachoric correlations 
of the four factors just described, with the 
seven demographic variables and the 25 
psychological test variables. 

From a review of the obtained values, the
Job Stability factor is significantly related to
being married and in a white collar occupation.
From the psychological test data these
types of people can be described as emotionally
mature, conscientious, not sensitive, and
sophisticated, as well as aware of the importance
of family, and needing independent
security.


TABLE 10

FACTOR INTERRELATIONSHIPS—DISCHARGED PATIENT INTERVIEW

                                       Intercorrelations
Factor  Description                 A     B-1     2      4

A       Job Security                      .40   -.16   -.07
B-1     Medical—Health                          -.31   -.12
2       Not Doing—Inertia                              -.51
4       No Change Since Illness


TABLE 11

CORRELATION MATRIX OF DEMOGRAPHIC AND PSYCHOLOGICAL TEST VARIABLES WITH THE
FOUR DISCHARGED PATIENT INTERVIEW FACTORS
(N = 185)

                                                          Factors
                                              A           B-1         2           4
Variable                                      Job         Medical—    Inertia     No Change
                                              Security    Health

Demographic Data:
  Young age (39 and under)                    -.04         .44**      -.13         .09
  Married                                      .27*        .34**      -.09         .14
  Education (11 plus grades)                   .21*        .28**      -.40**       .03
  Low compensation ($65 per month or less)     .21*        .13        -.15*        .21**
  Occupation (white collar)                    .30**       .22**      -.32**      -.12
  No previous TB hospitalizations             -.02         .16*       [illegible] [illegible]
  No other medical diagnoses                   .05         .20**       .09        -.22**

Psychological Test Scores:
  PTI (intelligence)                           .20*        .34**      -.42**       .04
  Cattell 16 PF test
    MD  no distortion                          .19*       -.13         .16*        .20**
    A   warm, outgoing                         .03         .05        -.11        -.16*
    B   bright                                 .10         .13        -.23**       .09
    C   mature                                 [illegible]** .42**    -.04         .03
    E   dominant                               .04         .00        -.12         .05
    F   not depressed                          .18*        .26**      -.11        -.04
    G   conscientious                          [illegible] .16*       -.17*        .03
    H   adventurous                            .06         .00        -.08         .04
    I   not sensitive                         -.23**      -.04         .04        -.06
    L   trustful                               .16*        .28**      -.09        -.05
    M   conventional                           .12         [illegible] .08        -.17*
    N   sophisticated                         -.24**      -.12         .02        -.03
    O   not anxious                            .11         .03         .10        -.12
    Q1  experimenting                          .15*        .16*       -.03         .08
    Q2  self-sufficient                        .21*       -.04        -.25**       .13
    Q3  controlled                             .07         .23**       .07        -.16*
    Q4  tense                                  .07         .10         [illegible] .01

Composite Projective Test Scores:
  Family important                             .25**       .31**       .06        -.20**
  Evasion                                     -.10         .02        -.03         [illegible]
  Passivity                                    .10        -.16*        .04         .22**
  Anxiety                                      .02         .10        -.01        -.08
  Moral judgment                               .18*        .20**      -.07         .08
  Control                                      .10         .22**      -.17*        .07
  Independent security                         .38**       .18*       -.28**      -.05

* Significant at the .05 level.
** Significant at the .01 level.


Factor B-1, a feeling of good health, is
related significantly to being younger, married,
with a better education, a white collar
job, and no other medical diagnoses. The
psychological tests suggest that these individuals
tend to be more intelligent, mature,
not depressed, trustful, and controlled. They
likewise believe in the importance of the 
family, and in judging others in terms of 
moral values. 

The Inertia or Do Nothing factor relates 
negatively to education, and white collar 
types of jobs. The factor also is related to 
dullness, dependency, and a low need for 
independent structure. 

In general all the correlations of Table 11 
are fairly low. The highest is .44. While 
34 relationships out of a possible 128 are 
statistically significant at about the 1% level, 
none of these is high enough to be of much 
practical value. 


The Development and Refinement of an 
Index of Community Adjustment. The sta- 
tistical analysis to this point has produced 
four main factors of what constitutes ad- 
justment to the community. Although some 
precision might be lost in the process, it was 
considered desirable to further condense 
these measures into a single scale which 
could be used in qualitatively rating the 
level of adjustment made by a former pa- 
tient. The possibility of such combination 
was enhanced by the interrelationships be- 
tween factors found in Table 10. Some of 
the correlations here and in Table 11 sug- 
gest, for example, that Factors A and B-1, 
involving feelings of job security and good 
health, would be on the positive end of an 
adjustment continuum, while the Inertia 
factor would be toward the opposite end 
of that scale. Factor 4, the No Change com- 
ponent, would appear on the negative side, 
but less extreme. 

The importance of providing for cross- 
validation of this index brings in a limiting 
consideration. The subjects used in this 
study were not studied while patients in the 
hospital, but only as they returned for medi- 
cal follow-up examinations. Those subjects 
used in studies reported in the previous two 
sections, however, were involved in the 
midst of their hospitalization for tubercu- 
losis. Psychological test results as well as 
data on their adjustment to hospitalization 
and/or their response to treatment are 
already available. This group then appears 
to be the most appropriate to use in validating
an index of community adjustment
which is to be developed from data of this
study. Using that group will also permit 
comparisons between hospital adjustment 
and subsequent community readjustment on 
the same individuals. The psychometrics 
for this present study were usually given 
on the same day as the follow-up interview 
with the psychologist. Thus it represents an 
attempt at concurrent prediction. Since the 
proposed followup on those subjects studied 
first as hospital patients would take place a 
year or so later, this permits an attempt at 
prediction over time which would be more 
desirable. 

For followup on these former subjects, 
a more concise and less time consuming 
interview form was needed. It was also 
considered expedient that this form be pre- 
pared so it could be mailed out to those 
subjects who missed their follow-up ap- 
pointments. Where the original question- 
naire of 64 specific response items, on 10 
pages, took about an hour for the psycholo- 
gist to complete, the revised form contained 
12 open-ended questions on 3 pages. These 
can be filled out by either the psychologist 
or the subject in about 5 minutes. The sim- 
plified questionnaire did not include all the 
variables which contribute to the four factor 
scores, because the results of those factor 
analyses were not yet available when the 
revision was made. If an over-all Index of 
Community Adjustment is to be developed 
from the results of the factor analyses, and 
cross-validated on the subjects from Studies 
1 and 2, it should include only those items 
which are found in the revised and sim- 
plified questionnaire. Out of the 21 vari- 
ables used to compute factor scores, 10 were 
found to be available in the new question- 
naire. These 10 items were used to make up 
the Index of Community Adjustment, and 
are listed in Table 12. 

These questions selected from the Inertia 
and the No Change factors were reversed 
in their wording so as to give the index 
items a consistent positive direction. Thus, 
the higher the index score, the better the 
adjustment to the community. 
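
The scoring rule described above, in which reversed items are flipped so that every plus counts toward better adjustment, can be sketched in Python. The item keys below are hypothetical labels for the original interview wording, chosen for illustration; only the rule of reversing the Inertia and No Change items and then counting pluses comes from the text.

```python
# Hypothetical keys for the interview items as originally worded.
# The second field is True for items whose wording was reversed
# (those taken from the Inertia and No Change factors).
INDEX_ITEMS = [
    ("does_not_want_to_change_jobs", False),
    ("leisure_spent_in_reading_or_music", False),
    ("feels_present_health_is_good", False),
    ("does_not_live_with_family", True),
    ("no_physical_complaints", False),
    ("full_work_tolerance", False),
    ("no_changes_in_organizations_or_sports", True),
    ("does_not_smoke_less", True),
    ("has_no_hobbies", True),
    ("plays_some_sports_now", False),
]


def index_score(responses):
    """Count the items scored plus after reversing the marked items."""
    score = 0
    for key, reverse in INDEX_ITEMS:
        answer = responses[key]
        plus = (not answer) if reverse else answer
        score += 1 if plus else 0
    return score


# Hypothetical respondent: favorable on every item except hobbies.
respondent = {key: not reverse for key, reverse in INDEX_ITEMS}
respondent["has_no_hobbies"] = True
print(index_score(respondent))  # prints 9
```

Because every item contributes either 0 or 1, the index ranges from 0 to 10, with higher scores indicating better adjustment.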


14 This is a study designated as Project 5 in the 
Cooperative Psychological Research Program.


TABLE 12
ITEMS USED IN COMMUNITY ADJUSTMENT INDEX

                                                                                Factor
Item                                               Factor Source               Loading

Does not want to change jobs (plans to continue
  in present work)                                 Job Security                  .61
Leisure time spent in reading or music             Job Security                  .37
Feels present health is good                       Health                        .72
Currently lives with family                        No Change (reversed)          .53
No physical complaints                             Health                        .60
Full work tolerance—7 plus months after discharge  Health                        .94
Some changes since illness (re: organizational
  membership or sports participation)              No Change (reversed)          [illegible]
Does smoke less since illness                      No Change (reversed)          .35
Has some hobbies                                   Inertia (reversed)            .70
Plays some sports now a                            Health

a This item was dropped from consideration in original factor matrices because of an rt of .99 with the work tolerance item.


Table 13 shows the distribution of index 
scores. These appear to be fairly well dis- 
tributed with a tapering off at both ends, 
and they spread themselves throughout the 
full range of the available scores. 

Since this Community Adjustment index 
is to be used as a measuring device, its items 
were studied to make sure that each was 
contributing properly to the index score. 
The level of difficulty and the index of discrimination
were used for this purpose.¹⁵

15 On the interview record those factual items
which were present and thus could be scored
plus, were treated in the same way as if they were
a question answered correctly on an achievement
test.

The index items in Table 14 are arranged
in order of difficulty level (defined as percentage
scored plus) and vary from 82%
to a low of 27%, with an average of 52%,
which may be considered a fairly good distribution.
The index of discrimination is
calculated as the difference in percentage
scored plus between the top and bottom
27% as judged by the total adjustment
score. Because of only a 10-item scale the
top and bottom 27% could only be approximated,
but the level of discrimination is
quite high with 6 of the 10 items having a
difference of +50% or better, and 9 of the
10 showing +40% or better. The least discriminating
item is “Most of his leisure time
is spent at reading or music,” and this still
favors the higher group by 28%. Within
the limits of the study so far, this index
appears to be quite usable.
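
The two item statistics defined above can be sketched in Python. The group sizes and the plus counts for the three items shown are those reported in Table 14; the function names are assumptions introduced here, and the count of 123 used in the difficulty example is a hypothetical figure chosen to reproduce the reported 82%.

```python
BOTTOM_N = 30  # bottom group (index scores 0-2)
TOP_N = 47     # top group (index scores 6-10)

# (item, number scored plus in bottom group, number scored plus in top group)
ITEM_COUNTS = [
    ("Currently lives with family", 13, 43),
    ("Full work tolerance 7 plus months after discharge", 5, 45),
    ("Most leisure time: reading or music", 5, 21),
]


def percent_plus(count, group_size):
    """Percentage of a group scored plus on an item."""
    return 100.0 * count / group_size


def difficulty_level(plus_count, total_n):
    """Difficulty level: percentage of all subjects scored plus."""
    return percent_plus(plus_count, total_n)


def discrimination_index(bottom_count, top_count,
                         bottom_n=BOTTOM_N, top_n=TOP_N):
    """Top-group percentage plus minus bottom-group percentage plus."""
    return percent_plus(top_count, top_n) - percent_plus(bottom_count, bottom_n)


for item, bottom, top in ITEM_COUNTS:
    print(item, round(discrimination_index(bottom, top)))

# Hypothetical count: 123 of the 150 cases scored plus gives 82%.
print(round(difficulty_level(123, 150)))
```

For the leisure-time item the difference comes to 28 percentage points, which agrees with the figure quoted in the text.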

TABLE 13

DISTRIBUTION OF SCORES OF COMMUNITY
ADJUSTMENT INDEX
(N = 150 discharged cases)

Score    Frequency
[table entries not preserved]


A small pilot reliability study of the index
was conducted under conditions which
would be expected to lower the results obtained.
The original extensive interviews
with the psychologist at one hospital were
used for the first calculation of the index
scores. About 18 months later the revised
and simplified form of the questionnaire
was mailed to the same group at home. Of
the 145 sent out, 75 were completed and
returned. The correlation between the index
scores calculated from these two questionnaires
was .81, which seems fairly good considering
the time lag, the changed form of
the questionnaires, and the different method
of administration.

Information on the validity of the index 
would be very difficult to obtain since it 
would involve some independent value judg- 
ments as to the quality or level of the sub- 
ject's readjustment to the community. Those 
in the best position to observe this would 
also be those most liable to bias due to 
emotional involvement with the veteran. 
While a number of the items included in 
the scale could be verified by a trained inter- 
viewer visiting the home, such a program 
would not be feasible. 

The Index of Community Adjustment 
was then correlated with the biographical 
and psychological data. Results are given in
Table 15.

All of the significant correlations are in 
the expected direction. That is, people tend 
to adjust better if they are more intelligent, 
have a higher level of education, are toward 
the professional end of the scale of occu- 
pations, and are younger. Higher levels of 
anxiety or neuroticism tend to interfere 
with successful readjustment to community 
life. Similarly, good adjustment scores are 


associated with maturity, nondepression, 
and emotional control. 

Within the framework of this present 
cooperative research project in tuberculosis, 
this index will serve a very useful role in 
studies of the relationship of type or level 
of adjustment to hospitalization, and of 
response to medical treatment to subsequent 
community adjustment. 


TABLE 14

DIFFICULTY LEVEL AND INDEX OF DISCRIMINATION OF ITEMS USED IN THE
INDEX OF COMMUNITY ADJUSTMENT

                                                    Level of            Index of Discrimination
                                                    Difficulty    Bottom 20%     Top 31%
Item                                                (% scored     (Scores 0-2)   (Scores 6-10)   Difference
                                                    plus)         (N = 30)       (N = 47)

Currently lives with family                           82%          13 (43%)       43 (91%)         48
Full work tolerance 7 plus months after discharge     71            5 (17)        45 (96)          79
No physical complaints                                62            7 (23)        40 (85)          62
Does not want to change jobs                          60            0 (0)         32 (68)          68
Does have some hobbies                                53            2 (7)         28 (60)          53
Most leisure time—reading or music                    44            5 (17)        21 (45)          28
Feels present health is very good                     44            2 (7)         37 (79)          72
Does smoke less since illness                         39            3 (13)        25 (53)          40
Some changes since illness                            35            7 (23)        29 (62)          39
Does participate in some sports                       27            1 (3)         25 (53)          50

Average                                               52%           15%           69%              54%


TABLE 15

RELATIONSHIPS BETWEEN THE INDEX OF COMMUNITY
ADJUSTMENT AND OTHER VARIABLES
(N = 150)

                                              Index of
Variable                                      Community
                                              Adjustment

Young age (under 40)                            .27**
Married                                         .33**
Education                                       .26**
Occupation (white collar)                       .38**
Low compensation                                .34**
No previous TB admissions                       .30**
No other medical diagnoses                      .21*
PTI (intelligence)                              .46**
Cattell 16 PF test:
  MD  no distortion                            -.08
  A   warm, outgoing                            [illegible]
  B   bright                                    .26**
  C   mature                                    .40**
  E   dominant                                  .05
  F   not depressed                             .51**
  G   conscientious                             .00
  H   adventurous                               .25**
  I   not sensitive                             .01
  L   trustful                                  .03
  M   conventional                              .10
  N   sophisticated                            -.11
  O   not anxious                               .06
  Q1  experimenting                             .22**
  Q2  self-sufficient                           .16*
  Q3  controlled                                .40**
  Q4  not tense                                 .29**
Composite Projective Test Scores:
  Family important                              .42**
  Evasion                                      -.08
  Passivity                                     .25**
  Anxiety                                       .10
  Moral judgment                                .14
  Control                                       .19*
  Independent security                          .38**
Low combined anxiety scale score                .42**
Low combined neuroticism scale score            .49**

* Significant at the .05 level.
** Significant at the .01 level.


Summary


This portion of the Veterans Administration
Cooperative Psychological Research
Program was concerned with the readjustment
of patients on return to their home
community. Also to be investigated were
the relationships between selected demographic
and psychological data and this
community adjustment. An important phase
of the work was the development of a single
index score which would reflect the level
of the subject's community adjustment.

The procedure involved collecting data
from the subject's hospital record, and administering
a battery of psychological tests,
followed by a long and structured interview
covering a wide range of behaviors. Two
factor analyses of the interview data yielded
four main factors which seemed to relate
to job security, health, inertia, and lack of
change following illness. The interview
items with high loadings on these factors
and which could be scored from a shortened
and simplified version of the interview form
were combined to make a 10-item Community
Adjustment index. The items of this
index were studied for difficulty level and 
degree of discrimination. A short reliability 
study of the index also was reported. These 
procedures seem to indicate that the index 
is satisfactory. 

The relationships between selected psy- 
chological and background data and the 
Community Adjustment index were com- 
puted. The results of these procedures are 
what might be expected, with those subjects 
obtaining the highest scores who were 
younger, more intelligent, better educated, 
and at the white collar end of the occupa- 
tional scale. 

The psychological data associated with 
good community adjustment included non- 
distortion of the environment, maturity, no 
depression, low anxiety, nonneuroticism, 
good emotional control, and a high regard 
for the importance of the family. The 
results are consistent with a definition of 
good adjustment to a culture characterized 
by a predominantly middle class value 
system. 


SUMMARY AND IMPLICATIONS 


The three projects described in this mono- 
graph are the first in a planned series of 
studies of the psychological and social 
aspects of physical disease which is being 
undertaken cooperatively by psychologists 
in a number of Veterans Administration 
hospitals. The major rationale underlying 
these studies is that the manner in which 
an individual adjusts to the complex changes 
in his life pattern brought about by somatic 
illness depends to a large extent upon 
psychological and social factors in his life. 
Patients with pulmonary tuberculosis are a particularly
appropriate group for study because of the
gross change in the patient's life pattern
which is necessitated by current methods
of treatment. Because the disease is contagious,
the patient is kept restricted even
though feeling well. The treatment takes
many months and often results in the disruption
of occupational and social adjustments.
In short, tuberculosis presents a
great challenge to a patient's modes of 
adjustment. 


Each of the three projects concerned
itself with a significant aspect of the process 
of adjusting to pulmonary tuberculosis and 
its treatment. The first studied the quality 
of the patient's adjustment to hospitaliza- 
tion. Independent ratings by his physician, 
the charge nurse, the aide who knew him 
best, and the patient himself were used to 
establish the criterion measure. 

The second area of study concerned the 
patient's response to treatment in terms of 
how rapidly he recovered. The criterion 
was based on X-ray evidence and on the 
results of laboratory tests showing the con- 
version of the patient's sputum or gastric 
content to a noninfectious status. 

In the third study, the quality of the 
patient's adjustment on return to his home 
community became the criterion. This was 
judged from an intensive structured inter- 
view with a psychologist and was obtained 
when the patient returned to the hospital 
for a medical followup at least 6 months 
after leaving the hospital. 

Certain procedures were uniformly car- 
ried out on all of the subjects used for the 
three studies. The same medical, social, and 
demographic data forms were administered 
to all, as was an identical battery of psycho- 
logical tests. In the selection of the psycho- 
metric measures the basic requirements 
were that the tests should measure validly 
over a wide range of ability levels, be easy 
to administer and score, and provide a short 
testing session that would not overtire the 
patients. Many commonly used tests were 
inappropriate for use because of these 
special requirements. The battery selected
included a 5-minute, wide range intelligence
test—the Personnel Test for Industry,
Verbal Test A; the Cattell 16 Personality
Factor Test—an objective personality measure
derived by factor analysis; a semistructured
personality measure—the Madison
Sentence Completion Test, which was
originally prepared especially for tuberculosis
patients; and finally, a projective
test—the House-Tree-Person Test, of which 
only the person drawings were scored. 

This project was viewed as a preliminary
investigation to screen a large mass of data
for significant relationships. Later studies
could then be undertaken to more specifically
and rigorously test hypotheses which
are suggested by the results. In each section 
relationships between the psychological and 
demographic data and the criterion meas- 
ures were determined, and the implications 
were discussed. Each area will be sum- 
marized separately before interrelationships 
between the studies are discussed. 


Hospital Adjustment 


Hospital adjustment was seen to be a 
complex set of behaviors which were deter- 
mined by factors both within the individual 
and in his environment. Measures were 
needed both from the patient and from the 
hospital personnel who were familiar with 
his adjustment. A series of multiple-choice 
rating forms was devised and these were 
filled out not only by the patient but by 
the physicians, nurses, and aides who knew 
the patient best. 

In order to arrive at a single index of 
hospital adjustment which might be related 
to the other areas of the study, a rational 
analysis of all rating scale items was per- 
formed. Depending on the “goodness of 
adjustment” which each response choice 
reflected, a numerical value, ranging from 
+2 through —1, was assigned. On this 
basis, a total Hospital Adjustment Index 
score was computed. A sample of 350 nonpsychiatric
tuberculosis patients on whom
there was complete data was scored in this 
manner. 
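
The scoring rule described above can be sketched in a few lines of Python. This is an illustration only: the item names and the value attached to each response choice are hypothetical, since the text gives only the range of values (+2 through -1) and not the scoring key itself.

```python
# Hypothetical scoring key: for each rating item, the value assigned to each
# response choice. Only the range (+2 through -1) comes from the text; the
# items and the individual values are invented for illustration.
SCORING_KEY = {
    "follows_ward_rules": {"a": 2, "b": 1, "c": 0, "d": -1},
    "attitude_to_staff": {"a": 2, "b": 0, "c": -1},
    "sleep": {"a": 1, "b": 0, "c": -1},
}


def adjustment_index(responses):
    """Sum the keyed values of the response choices on one rating form."""
    return sum(SCORING_KEY[item][choice] for item, choice in responses.items())


example = {"follows_ward_rules": "b", "attitude_to_staff": "a", "sleep": "c"}
print(adjustment_index(example))  # 1 + 2 - 1 = 2
```

Summing keyed values in this way yields a single number per patient, which is what allows the index to be correlated with the other variables.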

In addition, a factor analysis of the hos- 
pital adjustment ratings was performed and 
six of the factors which were extracted 
entered into further analysis: I. Cooperative,
II. Positive attitude toward the hospital,
III. Social activity, IV. Sleep and
appetite problems, VII. Passivity to hospital,
and X. Difficulty in accepting role
of patient. Both the Hospital Adjustment 
Index and the scores derived from the fac- 
tors were then correlated with the demo- 
graphic and psychological test data. Al- 
though none of the resulting tetrachoric 
correlations was high enough to allow reli- 
able individual prediction, a number of 
statistically significant correlations seemed
to group themselves into clusters which
could meaningfully be studied. These were 
discussed under the headings “Youth,” “At- 
titude toward the hospital,” and “Passivity.” 

Younger patients appear to be less co- 
operative with hospital rules and regulations 
and tend to have a negative attitude toward 
the hospital. They are socially active, how- 
ever, and generally without sleep and appe- 
tite problems. Patients with positive atti- 
tudes toward the hospital tend to be intel- 
lectually less bright and to lack “independent 
security.” Passivity seems to be associated 
with cooperativeness and ease in accepting 
the role of the tuberculous patient. The 
passive patient (as with persons who have 
a positive attitude toward the hospital) 
scores high on the Hospital Adjustment 
Index and is not endowed with much 
independent security. 


Response to Treatment 


The aim of this project was to explore 
the relationships between psychosocial data 
and the patient's response to medical treat- 
ment of his tuberculosis condition. Rela- 
tively rigid criteria for inclusion of a patient 
in this project resulted in an N of 78, drawn 
from the much larger patient population of 
the over-all study.

Three measures were selected and defined
as important indications of progress in a 
patient's medical condition. The first was 
the rate of conversion of the patient's 
sputum or gastric content from showing 
evidence of the tubercle bacilli to a “nega- 
tive” state where none is found. The second 
was the number of months after the initia- 
tion of chemotherapy required to achieve 
X-ray stability, a condition of unchanging 
X-rays over a 3-month period. The third 
was the number of months required for 
cavities (if initially present) to be consid- 
ered closed by X-ray examination. The 
second and third measures were found to 
be highly correlated; therefore, only the 
former was used in an additive combination 
with the first measure to arrive at an Index
of Response to Treatment. When this index
was applied to the 78 patients, a significant
mean difference was found between the
patients with far advanced and moderately
advanced disease. Thus, they were considered
separately in subsequent statistical
analysis.

No significant relationships were found 
between the psychological or demographic 
variables and response to treatment in the 
moderately advanced group. In the far 
advanced group, however, freedom from 
anxiety was found to be significantly related 
to good response to treatment. The data 
suggested that this freedom from anxiety 
was based on a secure personality structure, 
on effective use of defenses such as with- 
drawal, or on traits such as passivity and 
submissiveness. 

Of considerable interest—both theoretical 
and practical—is the disparity of results 
between the two diagnostic groups. It was 
felt that the difference between the groups 
may rest in differences in composition of 
the groups along other dimensions than that 
of severity. For example, an obvious differ- 
ence existed between the groups with respect 
to the incidence of cavitary disease; perhaps
the primary effect of psychological factors
on tuberculous disease is exerted on the
process by which cavitary tissue damage is
resolved. 

A multiple regression equation was con- 
structed from three factors of the 16 PF 
test. This provided an index which predicts 
response to treatment among patients with 
far advanced tuberculosis better than any 
single factor alone. If the level of predic- 
tion of this equation holds up in cross- 
validation, it should provide a very valuable 
aid in structuring the treatment setting for 
such patients. 
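
To make the idea of a multiple regression equation concrete, the following sketch fits a least-squares equation to three predictor scores and compares the multiple correlation with the single-factor correlations. Everything in it is an assumption made for illustration: the data are synthetic, the weights are invented, and the helper functions are hypothetical; none of it reproduces the equation actually derived in the study.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit(rows, y):
    """Least-squares weights (intercept first) from the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


random.seed(0)
# Synthetic stand-ins for three personality factor scores and a criterion.
rows = [[random.gauss(0, 1) for _ in range(3)] for _ in range(40)]
y = [0.5 * a - 0.3 * b + 0.2 * c + random.gauss(0, 0.5) for a, b, c in rows]

w = fit(rows, y)
predicted = [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in rows]
multiple_r = pearson(predicted, y)
single_rs = [abs(pearson([r[j] for r in rows], y)) for j in range(3)]
print(round(multiple_r, 2), [round(s, 2) for s in single_rs])
```

In the sample used to fit the equation, the multiple correlation can never fall below the best single-factor correlation, which is why the level of prediction has to be checked on a fresh sample before it is relied upon.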


Community Adjustment 


The purpose of this project was to relate 
psychological and other variables to the 
posthospital community adjustment made by 
the patient after discharge from the hospital. 
In order to assess community adjustment, 
a lengthy interview with each of 185 former
patients was conducted. These individuals 
also took psychological tests, and studies 
were made of their medical folders. The 
interview item responses were dichotomized
and tetrachoric intercorrelations computed. 
Two factor analyses were then performed. 
One of the two factors extracted from the 
first and one of the three extracted from 
the second proved to be virtually identical. 
Thus, a total of four more-or-less inde- 
pendent factors was derived from the two 
matrices. These were Job Security-Stability, 
Medical-Health, Not Doing-Inertia, and No 
Change Since Illness. 

Ten items were drawn from the interview 
data, on the basis of their high factor load- 
ings on the four factors. These items were 
combined to form a Community Adjustment 
index and they were studied for difficulty 
level and degree of discrimination. The 
results of this and of a short reliability 
study of the index suggested that it was a 
satisfactory one for use in subsequent pro- 
cedures and computations. 

The index was then related to psycho- 
logical and demographic data. The former 
patients who were younger, more intelligent, 
better educated, and employed in white 
collar work appeared to be making the better 
posthospital adjustment. With respect to 
psychological tests, no distortion of the 
environment, maturity, lack of depression, 
low anxiety, nonneuroticism, good emotional 
control, and a high regard for the impor- 
tance of the family were associated with 
better community adjustment following hos- 
pitalization. These findings were seen as 
consistent with the definition of good ad- 
justment to a culture characterized by a 
predominantly middle class value system. 


Relationship between Hospital Adjustment
and Response to Treatment 


It is of interest to ask whether good hos-
pital adjustment is related to good progress 
in recovery from tuberculosis. To answer 
this question the index measures obtained 
in the studies of these two areas of behavior 
were correlated. A phi correlation of .00 
was obtained for the moderately advanced
disease group. The correlation for the far
advanced group, on the other hand, was .40
and the chi square 4.82. This relationship
is significant at the .05 level.
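
For a 2 x 2 table, the phi coefficient and the uncorrected chi square are tied together by chi square = N x phi squared, so the reported values (.40 and 4.82) imply a far advanced group of roughly 30 patients. The sketch below illustrates the relation. The cell counts are hypothetical, chosen only because they reproduce the reported figures, and the absence of a continuity correction is an assumption rather than something the text states.

```python
import math


def phi_and_chi_square(a, b, c, d):
    """Phi coefficient and chi square (no continuity correction) for the
    2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return phi, n * phi ** 2


# Hypothetical counts (N = 30): rows are good/poor hospital adjustment,
# columns are good/poor response to treatment. The actual table is not
# reported in the text.
phi, chi2 = phi_and_chi_square(a=11, b=4, c=5, d=10)
print(round(phi, 2), round(chi2, 2))  # 0.4 4.82

# With 1 degree of freedom, chi square must exceed 3.841 for the .05 level.
print(chi2 > 3.841)  # True
```

The same arithmetic shows why a phi of .00 in the moderately advanced group gives a chi square of zero regardless of the group's size.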


It is obviously not possible for this study 
to answer the question as to what is the 
underlying basis for the correlation which 
is found in the far advanced group or why 
a similar correlation is not obtained for the 
patients with moderate disease. It might be 
postulated that the correlation is evidence 
that hospital rules and regulations have 
truly been arranged so as to facilitate the 
recovery process. It is perhaps more prob- 
able that the correlation reflects common 
factors which are related to each of the 
indices. One such factor might be the rela- 
tive freedom from anxiety which has been 
found to be significantly related to both 
hospital adjustment and response to treat- 
ment. In either case, it may be comforting 
to hospital administrative personnel to learn 
that—at least for this group—their insist- 
ance on compliance with hospital regulations 
may be justified by medical considerations. 


Synthesis of the Three Studies 


At the outset of discussion of the findings 
themselves, it is important to note that the 
results of the studies are, in general, based 
on correlations which are statistically sig- 
nificant but do not indicate a sufficiently 
high relationship between the variables to 
provide a basis for adequate individual pre- 
diction. At the same time, patterns of sig- 
nificant correlations can often suggest fruit- 
ful lines for further research. They can
also have a value in the clinical sense, to 
the degree to which they suggest courses 
of patient management which are not being 
currently used but which make good psycho- 
logical sense. 

The patient who adjusts well to the hos- 
pital has been described as being older, 
more passive, and less intelligent than the 
patient who makes a poor adjustment. The 
“good patient” is apparently a person whose
needs are being adequately met in the rather 
regimented, closely supervised atmosphere 
of the tuberculosis ward. 

Although the sample of patients in the 
community adjustment study was a different 
one from that on which the hospital adjust- 
ment study was done, its members represent 
the same population of veterans with tuber- 
culosis. Some direct comparison is, there-
fore, permissible. Interestingly, the dis- 
charged patients who were adjusting better 
to the community shared several traits with 
those hospitalized patients who were making 
a relatively poor adjustment to the institu- 
tion. The younger, more independent, and 
intellectually brighter individual was found 
to be the one who was best meeting the 
criteria of good reintegration into the com- 
munity life. 

A tentative relating of the contradiction 
presented by the difference between the 
person who adjusts well in the hospital and 
the one who adjusts well in the community 
may be found in the differing needs and 
personality traits which can make for sat- 
isfaction in these relatively different en- 
vironments. The same factors that make it 
possible for an individual to adequately cope 
with the demands of everyday life in the 
community may make it very difficult for 
him to accept the prolonged inactivity, the 
highly structured and dependent atmos- 
phere, and the goal postponement involved 
in the hospital treatment of tuberculosis. 
Conversely, certain traits such as passivity, 
submissiveness, and lack of clear-cut goals 
(while auguring badly for adequate coping 
with competitive middle class community 
living) may be well satisfied in the tuber- 
culosis hospital setting. 

In contrast to the relatively sparse num- 
ber of significant relationships between the 
psychosocial variables and the criteria for 
hospital adjustment and response to treat- 
ment, many psychological and demographic 
variables were found to be related to good 
community adjustment. The person who 
makes a good community adjustment fol- 
lowing discharge from the hospital has been 
described as being younger, more intelligent, 
better educated, working at a white collar 
occupation, and showing maturity, lack of 
depression and anxiety, good emotional con- 
trol, lack of neuroticism, and a high regard 
for the importance of the family. These are 
the characteristics of the individual which 
society values and rewards with success. At 
the same time, it may be concluded that 
these personality characteristics are not conducive
to good hospital adjustment or response
to treatment for tuberculosis.

Several of the factors of the 16 PF test 
of personality are of special interest as the 
three studies are compared. "Experiment- 
ing" individuals tend to have poor hospital 
adjustment but good response to treatment. 
“Nondistorting” is found to be related to 
both good response to treatment and good 
posthospital adjustment but tends to be 
negatively related to hospital adjustment. 
It might be hypothesized that experimenting 
and nondistorting represent drives toward 
mastery of the environment—the obverse of 
the passivity that is seen in the patient with 
good hospital adjustment. 

Only one of the psychological variables— 
anxiety—was found to be related to all 
three of the criterion measures. Anxiety 
appears to play an important role in poor 
adjustment as seen both in the hospital and 
in the posthospital situations. Further, 
among those hospitalized patients with far 
advanced disease, the presence of anxiety 
was significantly related to less satisfactory 
response to the medical treatment of the 
disease process itself. These results lend 
some confirmation to the concept that anx- 
iety is a central psychological variable in 
determining a wide variety of behaviors. 


Implications 


Turning to the implications of the studies 
which have been reported in this mono- 
graph, some very important considerations 
concerning patient management present 
themselves. The finding that quite different 
personality traits are related to adjustment 
in the hospital as compared to adjustment 
in the community following discharge suggests
that further attention should be paid
to the kinds of need fulfillment which are
found in these two situations. Great importance
becomes attached to providing the
young, intelligent, “poor hospital adjuster”
with sufficient sources of gratification for
those needs on which his feelings of self-worth
and adequacy are based. In general,
he must be helped to preserve in the hospital
the family and community roles on which
the effectiveness of his posthospital adjust-
ment will depend. On the other hand, the 
extended period of hospitalization can also 
be used to aid the passive, submissive indi- 
vidual, who is perhaps too contented with 
hospitalization, to develop personal and so- 
cial attitudes and industrial skills which may 
better equip him to cope with posthospital 
life. 

Quite probably the significant relation- 
ships between anxiety and the three criteria 
areas represent only a slightly different 
aspect of the above considerations. The 
need to maintain anxiety at a minimal state 
makes sense both logically and psychologi- 
cally. This practical implication finds its 
most direct support in the indicated relation- 
ship between anxiety and the resolution of 
the physical disease process itself. 

It is axiomatic that, as research studies 
progress, many new areas of investigation 
present themselves. This has certainly been 
the case in the large scale project that has 
been described. Redefining and refining the
criteria used is certainly needed; closely 
associated with this is the need for the 
development of better measuring techniques, 
both for the criteria variables and for the 
psychological and psychosocial data. 

Certain relatively specific questions de- 
mand further study, whether it be simple 
replication for the purpose of cross-valida- 
tion or new research projects based on 
hypotheses generated from the current findings.
The hypothesized relationship between
anxiety and response to treatment has led 
to a new project that is now under way. 
The design of this new study provides for 
the utilization of psychological tests more 
sensitive to the anxiety factor and also 
includes measures of the functioning of the 
autonomic nervous system and adrenal- 
pituitary output under the chronic stress 
situation which hospitalization frequently 
engenders. Further investigation of the 
environment which the hospital affords the 
patient is also being carried out. These 
projects will include the development of 
more refined measures of patient and staff 
attitudes. It is hoped that such studies will 
provide further insights into the factors 
involved in compliance and adjusting be- 
havior in various institutional settings. 

Finally, the application of the general 
research design which was used in these 
projects to the study of patients with dif- 
ferent types of illnesses promises new in- 
sights into the relationships among diseases 
in which psychological variables may play 
important roles. The results of a series of 
research projects of this type would permit 
the kinds of cross-disease comparisons 
which will indicate whether the results of 
the projects which have been reported are 
characteristic of patients with tuberculosis 
or represent more generalized features of 
the psychological counterparts of chronic 
physical disease. 
APPENDIX


COMPOSITE PROJECTIVE TEST FACTORS


The semistructured, 20-item, Madison Sen- 
tence Completion Test, and the person drawing 
from the House-Tree-Person Test were used to 
obtain the projective test data in these studies. 
These provided three sources of scorable data. 


Language Characteristics 


The writing used to complete the Sentence 
Completion Test was studied for important 
variables and 21 were selected such as: type 
of writing (i.e., print, script, or both); sen-
tences omitted; percentage of personal pro- 
nouns, etc. 


Sentence Content 


An appropriate and meaningful classification 
of the content of the answers to the sentence 
completion items was developed through the 
study of the frequency distribution of those 
items. Sentence 4 with its scoring scheme may 
be considered typical of these. 


4. When the odds are against you.

a. Vigorous positive action (fight harder)
b. Action negatively stated (don't give up)
c. Half hearted action (do the best you can,
   try to do something)
d. Passive acceptance (be patient, forget it)
e. Religion (pray, have faith)
f. Philosophical statement (could be worse,
   worry does not help, a test of our resources)
g. Negative emotion (worry, frustrated,
   downhearted)
h. Hopeless situation (no escape, nothing
   you can do)


Person Drawing Characteristics 

Objectively observable aspects of the person 
drawing were scored as present or absent. This 
included some 35 items such as hands omitted, 
hair on the head, body shading present, an 
unseeing eye (empty circle), etc. 

The test protocols for 500 hospitalized sub- 
jects were used in devising the scoring vari- 
ables. After this was done, each test was then 
scored for all these 76 variables. For each 
answer a meaningful cutting point was selected 
to divide the distribution in two fairly equal 
parts. Then, using tetrachoric correlation, each 
was compared with all the others. The result- 
ing 76 by 76 correlation matrix was the source 
of the groupings of items that make up the 
composite projectives. Groupings of items 
which significantly relate to each other were 
selected by inspection, and named according to 
the apparent psychological meaning suggested 
by their content. Table A1 shows the items 
which contribute to each of the composite 
scores and their intercorrelation within each
factor.
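
The dichotomize-then-correlate procedure described above can be sketched as follows. The text does not say how the tetrachoric coefficients were computed, so this sketch substitutes the cosine-pi approximation, a well-known shortcut, for an exact tetrachoric estimate. The scores, the median cutting point, and the function names are hypothetical.

```python
import math
from statistics import median


def dichotomize(scores):
    """Split scores at the median: 1 if above the cutting point, else 0."""
    cut = median(scores)
    return [1 if s > cut else 0 for s in scores]


def tetrachoric_cos_pi(x, y):
    """Cosine-pi approximation to the tetrachoric correlation of two
    0/1 variables (an approximation, not the exact coefficient)."""
    a = sum(1 for u, v in zip(x, y) if u == 1 and v == 1)
    b = sum(1 for u, v in zip(x, y) if u == 1 and v == 0)
    c = sum(1 for u, v in zip(x, y) if u == 0 and v == 1)
    d = sum(1 for u, v in zip(x, y) if u == 0 and v == 0)
    if b * c == 0:
        if a * d == 0:
            raise ValueError("correlation is undefined for this table")
        return 1.0  # no discordant pairs at all
    return math.cos(math.pi / (1 + math.sqrt(a * d / (b * c))))


# Hypothetical scores on two scoring variables for eight subjects.
var1 = dichotomize([3, 9, 5, 7, 1, 8, 2, 6])
var2 = dichotomize([2, 8, 7, 6, 1, 9, 4, 3])
r = tetrachoric_cos_pi(var1, var2)
print(round(r, 2))  # 0.71
```

Applying such a function to every pair of variables would fill a square matrix like the 76 by 76 matrix mentioned in the text, which could then be inspected for clusters of related items.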
TABLE A1

ITEM GROUPINGS AND THEIR INTERCORRELATIONS ON PROJECTIVE TEST VARIABLES

Item Groupings                                        Intercorrelations (r_t)

1. Family Importance                                  #14       #20
   SC#12. I worry most about—family                   [illeg.]  .39
   SC#14. My family responsibilities—many, most
          important                                             .36
   SC#20. I feel happiest when—visitors, at home

2. Evasion                                            Nonpers.
                                                      pronouns  #10       #2        #19
   SC. Less than 13% nouns                            .33       .29       [illeg.]  [illeg.]
   SC. Over 7% nonpersonal pronouns                             .32       [illeg.]  [illeg.]
   SC#10. One can use liquor—when needed, anytime                         [illeg.]  [illeg.]
   SC#2. When first told of my TB—denial, no
          feeling                                                                   .26
   SC#19. I have most confidence—not the doctor

3. Passivity                                          #7
   SC#1. Lying in bed makes me—happy, relaxed,
          rested well                                 [illeg.]
   SC#7. Having TB wouldn't be so bad if—not more
          activity, had specific things

4. Anxiety                                            Variable
                                                      pressure  Retracing
   Person drawing: Broken lines                       .50       .24
                   Variable line pressure                       .69
                   Retracing

5. Moral Judgment                                     #9        #11
   SC#3. I like a doctor who—frank, honest            .31       .29
   SC#9. The sort of person I like is—basic
          character                                             .52
   SC#11. The sort of person I don't like is—basic
          character defect

6. Control                                            Vertical  Arms
                                                      midline   down
   Person drawing: Belt present                       [illeg.]  [illeg.]
                   Vertical midline                             [illeg.]
                   Arms straight down

7. Independent Security                               Added     Ground
                                                      details   line
   SC#17. When people push me around—not "don't
          like"                                       .49       [illeg.]
   Person drawing: Additional details                           [illeg.]
                   Ground line

Note. SC indicates sentence completion.
a Value was not computed.
contribute to an understanding of the
sources of variation in outcomes of group 
psychotherapy. More specifically it was de- 
signed to clarify the relations existing in 
group psychotherapy between certain char- 
acteristics of the therapist's responses and
patient intrapersonal exploration, and cer- 
tain characteristics of the group atmosphere 
and patient self-exploration. 

In recent years considerable research has 
been accumulated which supports the con- 
clusion that group psychotherapy can facili- 
tate constructive personality or behavioral 
change in persons described as mentally ill, 
alcoholic, emotionally disturbed, or psychotic 
(Baehr, 1954; Cadman, Misbach, & Brown, 
1954; Ends & Page, 1957; Fleming & 
Snyder, 1947; Geller, 1950; Gorlow, Hoch, 
& Telschow, 1952; Gosline, 1951; Graeber, 
Brown, Pillsbury, & Enterline, 1954; Gurri 
& Chasen, 1948; Hobbs & Pascal, 1946; 
Klapman, 1951; Mann & Semrad, 1948;
Peres, 1947; Peyman, 1956; Powdermaker 
& Frank, 1953; Sacks & Berger, 1954; Slavson,
1956; Tucker, 1956). An examination
of this evidence, however, reveals that
group psychotherapy is not uniformly facilitative
of constructive personality change in
patient members. Some group psychotherapy
groups do not reliably facilitate constructive
change in any members; some
individual members of a group are facilitated
in constructive change while other
members of the same group appear to be
unaffected by group psychotherapy.

under sponsorship of Carl R. Rogers at the Uni-

eration of the staff of Mendota State Hospital,
particularly Gilbert B. Tybring and Forrest Orr.
Appreciation is also extended to the following
persons who contributed heavily to the present
research as therapist or sample judge: Allyn
Roberts, Frank Farrelly, Ed Williams, and Emily
Early. This research was begun while the author
was a staff psychologist at Mendota State Hospital
and completed during a research associate appointment
at Iowa Child Welfare Research Station,
State University of Iowa.

* Now at the Psychotherapy Research Section,
Psychiatric Institute, University of Wisconsin.

In a research study designed to explore 
sources of variation in outcomes between 
groups, Ends and Page (1957) focused 
upon differences related to the orientation 
of the therapist. Using a Latin square design,
four therapists used four different
theoretical orientations in the resulting 16
groups: a didactic learning theory approach, 
a leaderless-group approach, a neopsycho- 
analytic approach, and a client centered 
approach. The member patients were hos- 
pitalized alcoholics, so that available follow- 
up data dealt with continuance or remission 
of alcoholic behavior specifically rather than 
with more general outcomes of group psy- 
chotherapy. The reported results, although 
based upon relatively small samples of
patients, due to a very high drop-out rate,
indicated that the client centered approach
resulted in significantly greater remission
of alcoholism when compared to either the
didactic learning approach or the use of
leaderless groups. Examination of the data
reveals no difference between the client
centered and the analytic approaches in remission
rates, with both yielding ratios of
remission to nonremission of approximately
1:2 while ratios of approximately 1:5 are
obtained for the didactic learning and leaderless-group
approaches.
These results strongly suggest that activities
of the therapist are significantly related
to variations in outcomes between group 
psychotherapy groups. The present author's 
personal observations of group psycho- 
therapy suggest that the behavior of the 
therapist varies from session to session and 
from patient to patient despite attempts to 
maintain a specific therapeutic approach. If 
these observations are indeed valid, then it 
might be expected that variations in out- 
comes within a group would be related to 
variation of the therapist's responses within 
the group. 

Some evidence for this expected relation- 
ship is available from a study by Hobbs 
and Pascal (1946), which was partially 
reported by Gorlow et al. (1952). They 
analyzed verbatim transcripts by classifying 
patient responses as either therapeutically 
positive or therapeutically negative, and
therapist responses as either client centered, 
eclectic, or didactic-authoritarian. Thus they 
were studying relationships internal to 
group psychotherapy sessions rather than 
setting up distinctions between groups. 
Their reported results indicated that both 
client centered and eclectic responses by the 
therapist were more highly associated with 
positive therapeutic statements by patients 
than didactic-authoritarian responses by the 
therapist. 

The results of the Hobbs and Pascal and 
the Ends and Page studies together point 
to the activities of the therapist as a signifi- 
cant source of variation in outcomes in 
group psychotherapy. Additionally, the re- 
sults of both studies indicate that a didactic- 
authoritarian approach is relatively ineffec- 
tive when compared to eclectic, analytic, or 
client centered approaches. This latter finding
suggests a fruitful approach to the
difficult problem of obtaining valid and
sensitive measures of outcomes. The effec- 
tive therapeutic approaches have in common 
the goal of facilitating self-exploration, 
while didactic and authoritarian approaches 
would be expected to inhibit self-explora- 
tion. 

It is here postulated that intrapersonal 
exploration is a sufficient antecedent con- 
dition for constructive personality change
in psychotherapy. If this statement is veridical,
then the study of group psychotherapy
can be simplified by focusing upon condi- 
tions which facilitate intrapersonal explora- 
tion, rather than dealing with a large num- 
ber of specific and value-laden changes in 
behavior. 

Clinical observations support such a state- 
ment. In successful psychotherapy, both 
individual and group, the patient is involved 
in a process of intrapersonal exploration— 
a process of coming to understand one’s 
beliefs, values, motives, and actions—while 
the therapist by reason of his training and 
knowledge of psychology is attempting to 
facilitate this process. In the terminology 
of psychoanalytic theory this process is de- 
scribed as the patient becoming aware of 
or exploring unconscious material and the 
distortion effects of that unconscious mate- 
rial upon perception of reality (Munroe, 
1955). For client centered theory this 
optimal therapy has meant an exploration of
increasingly strange and unknown and dangerous 
feelings in himself. . . . thus he becomes acquainted 
with elements of his experiences which have in 
the past been denied to awareness as too threat- 
ening, too damaging to the structure of the self 
(Rogers, 1955, unpublished). 

In addition to clinical findings, there 
exists considerable research evidence to sup- 
port the hypothesis that intrapersonal exploration
is a sufficient condition for con-
structive personality change. This evidence 
is largely available from studies of in- 
dividual psychotherapy, although Peres 
(1947) reports a study of client centered
group psychotherapy comparing patients
who benefited from therapy with patients
who showed no detectable benefit. Using
tape recordings of group psychotherapy sessions,
she classified patient statements into
those referring to personal problems and
those not referring to personal problems.
The results indicated that both groups made
equal numbers of references to personal
problems early in therapy, but that the benefited
group made significantly more personal
references in the last half of therapy, while
the nonbenefited group made fewer such
personal references. Considering all sessions
combined, the benefited group made
almost twice as many personal references
as did the nonbenefited group.

Using data obtained from individual psy- 
chotherapy, Braaten (1958) reports that 
when early and late interviews from more 
successful cases are compared with early 
and late interviews from less successful 
cases, the more successful cases show a 
significantly greater increase in the amount 
of self-references. Expression of the
private self—his awareness of being and 
functioning, his internal communication— 
also increased significantly more in the suc- 
cessful than in the less successful cases. 

Tomlinson (1959), in a very recent study
comparing more and less successful psycho-
therapy cases, used the Process Scale to 
evaluate differential changes from early to 
late interviews. This scale was devised by 
Rogers and Rablen (1958, unpublished) to 
measure quantitatively the amount and ex- 
tent of intrapersonal exploration, The re- 
sults indicated that the more successful 
patients showed a significantly greater in- 
crease in the amount and extent of intra- 
personal exploration from early to late 
interviews when compared to the less suc- 
cessful cases. 
_ As early as 1948 a study of client centered 
individual psychotherapy by Steele (1948) 
indicated that, in comparing more with less 
Successful cases, the more successful clients 
increasingly explore their problems аз 
therapy proceeds, while less successful cli- 
ents explore their problems less as therapy 
proceeds. Similar results are reported by 
Wolfson (1949), while supporting evidence 
is available in the reports of research by 
Seeman (1949) and Blau (1953). 

The above studies, then, are interpreted 
as confirming evidence for the conclusion 
that intrapersonal exploration is a sufficient 
antecedent condition for the consequence of 
constructive personality change. There is, 
in the opinion of the present author, con-
siderable heuristic value in such a conclusion
since intrapersonal exploration is much
more amenable to operational definition and
measurement than is the value-laden concept
of constructive personality change. Further,
intrapersonal exploration can be measured
directly from the psychotherapy interview
data.


General Outline of Research Approach 


From the above discussion three general 
suggestions for research into the sources of 
variations in outcomes of group psycho- 
therapy emerge: (a) an investigation of the 
therapeutic group psychotherapy process 
can fruitfully proceed by investigating con-
ditions which facilitate intrapersonal ex-
ploration in the group setting; (b) the in-
vestigation of facilitative conditions can
fruitfully be focused upon the character-
istics of the therapist's responses; and (c)
the characteristics of the therapist's re-
sponses most likely to facilitate intraper-
sonal exploration are those common to
client centered, eclectic, and analytic thera- 
pists and not held in common by didactic, 
authoritarian, or leaderless-group therapists. 
These suggestions are equally applicable to 
individual and group psychotherapy. Ob- 
servations by the present author, Slavson 
(1956), and Powdermaker and Frank 
(1953) suggest that a single patient in the 
group psychotherapy setting is not only 
responded to by the therapist, but also by 
other patient members. It would seem rea- 
sonable, then, to expect that the investiga- 
tion of facilitative conditions can fruitfully 
be focused also upon such responses by 
patient members. Such interpersonal inter- 
actions among the group members can be 
conceptually abstracted, and hence meas- 
ured, as group characteristics. 

The present research investigation, then, 
was designed to evaluate statistically the 
relationships between intrapersonal explora- 
tion and specific hypothesized therapist con- 
ditions (characteristics of the therapist's 
responses) and specific hypothesized group 
conditions (characteristics of the group at- 
mosphere abstracted from interpersonal 
interactions among group members). 


HYPOTHESIZED THERAPEUTIC CONDITIONS 


Therapist Conditions 


Both psychoanalytic (Alexander, 1948; 
Ferenczi, 1950) and client centered (Rog- 
ers, 1951, 1957) therapists have emphasized 
the importance of positive warmth and 
acceptance of the patient by the therapist 
and his understanding attitude toward the
patient. Also, both the analytic and client
centered theorists have emphasized that the 
therapist be mature and integrated in the 
therapeutic relationship. These character- 
istics of the therapist have been presented 
in an organized theoretical statement by 
Rogers (1957), in which he hypothesizes 
that three characteristics of the therapist in 
the therapeutic relationship, when ade- 
quately communicated to the patient, are 
both necessary and sufficient conditions for 
constructive personality change: empathic 
understanding of the patient by the thera- 
pist, unconditional positive regard for the 
patient by the therapist, and the genuineness 
or self-congruence of the therapist in the 
relationship. 

While it would be difficult to establish the 
necessity of these three therapist conditions 
—indeed the author would agree with Ellis 
(1959) that any specific condition is un- 
likely to be necessary—there is evidence to 
suggest that these conditions are thera-
peutically relevant. In a study of individual
psychotherapy, Halkides (1958) selected
10 cases of most successful and 10 cases of 
least successful patients and then selected 
early and late interviews from which she 
randomly sampled interaction units. Her 
results indicated significant relations be- 
tween each of the hypothesized three condi- 
tions and success and nonsuccess in therapy. 
In a more recent study Barrett-Lennard 
(1959) developed scales to measure these 
three therapist conditions by means of an 
inventory taken by the patient after the 
fifth interview. His results indicated that 
more experienced therapists were perceived 
by patients as having greater empathy, un- 
conditional positive regard, and genuineness. 
Further, in the more psychologically dis- 
turbed patients of his sample of clients, 
these conditions were significantly associ- 
ated with successful therapy. In both studies 
high correlations were also obtained be- 
tween the three conditions, so that it is 
questionable whether or not the three thera- 
pist conditions contributed independent 
sources of variation in outcomes. 

The emphasis of psychoanalytic writers
in discussions of understanding the patient
is differentiable from that of client centered
writers in their stress upon diagnostic ac-
curacy or sensitivity to feelings or experi-
ences, rather than upon a sharing quality
(Alexander, 1948; Ferenczi, 1930). Thus
a fourth therapist condition hypothesized to
relate directly to intrapersonal exploration
is a combination of psychoanalytic and
client centered views. This is the accuracy
of the therapist's response to the patient's
feelings or experiences: accurate empathy.
The importance of accuracy of the thera- 
pist’s response has only been slightly 
touched upon by research. However, Gil- 
lespie (1953) reports that verbal signs of 
resistance in client centered therapy, exclud- 
ing within-client signs, are preceded by 
therapist errors of inaccurate clarification 
or interpretation. 

In analyzing empathic ability or a general 
tendency to have warm feeling and liking 
for one’s patient, several writers have 
pointed to the role of assumed similarity 
(the tendency for a judge to describe him- 
self and the stimulus object in the same 
way). The results of studies by Bender 
and Hastorf (1953), Cronbach (1955), and 
Rodgers (1959) suggest that assumed simi- 
larity may be an underlying determiner of 
both accurate empathy and unconditional 
positive regard. Further, the results indi- 
cate that when favorability is controlled, 
there is no relationship between assumed 
similarity and real or actual similarity. That 
is, these findings suggest that a therapist 
may communicate positive warmth and em- 
pathic understanding to a patient only when 
he assumes a similarity between himself and 
the patient. On the basis of the author's
observations of group psychotherapy led
by a wide variety of therapists, a high
degree of assumed similarity by the thera-
pist between self and patient seems to mini-
mize the patient’s fears of others thinking
his inner thoughts to be strange or crazy.
Such fears, in the author's experience, are
a prime source of inhibition of intrapersonal
exploration in group psychotherapy.

It will be remembered that in Rogers'
statement of the necessary and sufficient
conditions for constructive personality
change, emphasis was placed upon the com-
munication of these therapist attitudes to
the patient. It might be expected that, other 
things being equal, the more frequent the 
initiation of communication by the therapist, 
the more these three therapist conditions 
will be communicated. Further, greater re- 
sponsivity of the therapist would be ex- 
pected to communicate greater interest and 
ego involvement of the therapist to the 
patient. A research study by Dittman 
(1952) into the process of individual psy- 
chotherapy deals directly with the degree of 
therapist participation or responsivity. 
When the patient statements are classified 
into retrogressive and progressive move- 
ment, it was found that high participation 
by the therapist is associated with progres- 
sive movement of the patient. 

A final therapist condition here hypothe- 
sized to facilitate intrapersonal exploration 
in group psychotherapy is the degree of 
leadership provided by the therapist for the 
group. It is assumed that leadership by the 
therapist is necessary to develop facilitative 
group conditions such as those described 
below. Although direct research evidence is 
lacking, the findings of Ends and Page of 
the therapeutic ineffectiveness of leaderless 
groups suggest the importance of leadership 
as a relevant variable in the study of group 
psychotherapy. 

Specifically, then, the following seven 
therapist conditions are hypothesized to 
show a positive relationship to measures of 
intrapersonal exploration in group psycho-
therapy:

1. Empathic Understanding
2. Accurate Empathy
3. Genuineness or Self-Congruence
4. Unconditional Positive Regard
5. Assumed Similarity of self and patients
6. Responsivity
7. Leadership


Group Conditions


In a theoretic consideration of effective 
variables in group psychotherapy it would 
seem desirable to classify group conditions, 
abstractions of characteristics of the group 
atmosphere, into two classes: conditions 
that are relatively under the direct control
of the therapist, and conditions that are
indirectly influenced by the therapist. We
shall consider the latter first.

It might be anticipated that if genuine- 
ness or self-congruence on the part of the 
therapist is facilitative of intrapersonal ex- 
ploration in individual psychotherapy, then 
genuineness of group members would be 
facilitative in group psychotherapy. From 
the author's experience genuineness of the 
group members is only indirectly influenced 
by the therapist; he cannot require that 
patients drop their facade during the group 
psychotherapy session. 

Studies of small groups, and particularly
studies of attitude or opinion change in
small groups, have suggested that group 
cohesiveness should be considered as a 
therapeutically relevant group condition. 
Cohesiveness of a group, which again is 
only indirectly influenced by the therapist, 
can be defined as the strength of all forces 
acting on the members to remain in the 
group. As such it would include the "liking" 
of the group members, the desire for relief 
from anxiety, pressure from family and 
friends, confidence in the therapist, etc. 
Although this variable has not been studied 
in relation to group psychotherapy, other 
research is relevant. Back (1951), studying 
experimental groups, reports that the 
greater the cohesiveness of the group the 
greater is the amount of influence exerted 
on members. Further, his results show that 
irrespective of the nature of the attraction 
to the group, members of high cohesive 
groups are less resistant to influence. Fes- 
tinger, Gerard, Hymovitch, Kelley, and 
Raven (1952) in studying deviates and 
conformists find that deviates in high co- 
hesive groups show less confidence in their 
opinions, and greater readiness to change 
their opinions. In the setting of group 
psychotherapy with schizophrenics this 
might suggest that delusional material is 
more easily given up by the patient in a 
high cohesive group than in a low cohesive 
group. 

A third characteristic of the group atmos- 
phere here hypothesized to relate positively 
to the amount of intrapersonal exploration 
is the degree of ego involvement of the 
group in the current discussion. Again, al-
though this has not been investigated as a
therapeutically relevant variable it would 
be expected that ego involvement of the 
group members in the discussion would be 
reinforcing to the patient expressing his 
feelings or experiences, and would hence 
lead to further expression of feelings or 
experiences. 

The following group conditions, indirectly 
influenced by the therapist, are specifically 
hypothesized to show a positive relation- 
ship to intrapersonal exploration in group 
psychotherapy:

1. Genuineness or Self-Congruence of the group
members
2. Cohesiveness of the group
3. Ego Involvement of the group members in the
discussion

The degree to which the group is discuss- 
ing specific and concrete feelings or experi- 
ences, rather than general and abstract feel- 
ings or experiences, is relatively under the 
direct control of the therapist. It has been 
the author's observation that therapists in- 
deed do either point the discussion in group 
psychotherapy toward specific or toward 
abstract and general feelings or experiences. 
Although this characteristic of group psy- 
chotherapy has not been specifically dis- 
cussed as a variable in theories of psycho- 
therapy, or in research, the crucial impor- 
tance of this variable for psychotherapy is 
implied in the discussions of client centered 
and psychoanalytic theory. Freud's initial 
position stressed two points (Freud, 1950), 
both of which remain basic to analytic 
theory: the recovery of repressed memories,
and the handling of repressed affects. Re- 
lease from repression is stated as essential 
to therapy. From Freud's discussion it is 
quite clear that even when these memories 
and affects are fantasy productions, they are 
specific and concrete and not abstract. In 
Rogers’ discussion of empathic understand- 
ing, too, there is reference to specific experi- 
encings of the patient rather than to abstrac- 
tions of experiencings (Rogers, 1951; 1955 
unpublished; 1958). Indeed, both client 
centered and analytic therapists generally 
regard a patient’s discussion of abstractions 
as defensive rather than exploratory. 


A second group condition relatively under 
direct control of the therapist that is hypoth- 
esized to show a positive relation to intra- 
personal exploration is the degree of de- 
individuation characteristic of the group. 
De-individuation refers to a response to 
others in terms of what the other is com- 
municating rather than a response to “who” 
the other is as a person. That is, de-indi- 
viduation is the opposite of personalization 
of interaction. It might be speculated that 
a response to the person tends to demand a 
reciprocal response which would inhibit 
intrapersonal exploration, while a response 
to the expressed feeling or experience would 
tend to demand further exploration. Al- 
though this variable has not been considered 
as therapeutically relevant either in theory 
or research, it has been used in research in 
experimental small groups. Festinger, Pepi-
tone, and Newcomb (1952) report research
results which indicate that under conditions 
of high de-individuation in a group setting, 
there is a reduction of inner restraint and 
an increase in reported satisfaction with the 
group experience when compared to condi- 
tions of low de-individuation. 

It might be expected that if empathic 
understanding of the patient by the thera- 
pist is facilitative of intrapersonal explora- 
tion, then empathic understanding by other 
patient-members in group psychotherapy 
would also be therapeutically relevant. 

A similar expectation might be held for 
the facilitative effects of unconditional posi- 
tive regard of the group toward its mem- 
bers. However, in this case it seems plaus- 
ible that this concept, as applied to char- 
acteristics of a group, might merge with a 
general supportive attitude of the group that 
may be therapeutically undesirable. That is, 
there appears to be a thin line between 
warmth, which is conceived to be facilita-
tive, and an attitude of support and mini-
mization of one's problem, which is con-
ceived to inhibit intrapersonal exploration.
For this reason, three measures which have
in common positive affect are included in
the present study: unconditional positive
regard of the group for its members, a co-
operative and mutually helpful group spirit,
and group sociability. All three variables
focus upon positive affect within the group,
and thus would be thought to operate by 
minimizing internal friction and maximiz- 
ing trusting relationships between group 
members. 

Since group unconditional positive regard 
would tend to have relatively more warmth 
and less supportive "minimization of prob- 
lems" than group sociability, a higher 
positive relationship to intrapersonal ex- 
ploration would be expected for the former. 

Although the effects of positive warmth 
have not been studied in group settings, the 
replicated study by Carter (1954) of experi- 
mental groups indicates that sociability is 
one of three factors which together grossly 
(adequately) describe interaction behavior 
of individuals in groups. For the present, 
then, it is assumed that all three variables 
are facilitative of group psychotherapy. 

The following group conditions relatively 
under direct control of the therapist, then, 
are hypothesized to show a positive relation 
to intrapersonal exploration: 

1. Concreteness or Specificity of expression of
feelings and experiences
2. De-individuation
3. Empathic Understanding by the group of its
members
4. Unconditional Positive Regard by the group
toward its members
5. Group Spirit of Cooperation and Mutual
Helpfulness
6. Group Sociability


PROCEDURE 


The data used in the present investigation
to evaluate the hypotheses under study were
3-minute samples of verbal interaction ob-
tained from transcriptions of tape recorded
sessions of 42 successive hours of group
psychotherapy from each of three separate
groups. The three groups from which
samples were drawn were open-ended
groups of hospitalized mental patients; each
group was led by a different therapist who
had had previous experience with hospital-
ized patient groups. Measurements of intra-
personal exploration and of therapist and
group conditions were obtained for each
sample. The test of the hypotheses under
study was the testing for statistical signifi-
cance of association between the measures
of conditions and the measures of intra-
personal exploration.

Using such a procedure, the findings of 
reliable relationships between hypothesized 
facilitative conditions and measures of intra- 
personal exploration, while not providing 
positive evidence of causal relations, would 
serve as support for such a priori hypotheses 
of dynamics. Further, the lack of such 
findings serves to make less tenable such 
hypotheses. The present simultaneous in- 
vestigation of a number of hypothesized 
facilitative conditions for intrapersonal ex- 
ploration is not only superior to separate 
investigations of single hypothesized condi- 
tions in economy of research effort, but it 
is also superior in allowing for comparisons 
of relative effectiveness of conditions and 
for testing of the independence of effects 
upon intrapersonal exploration. Thus a 
frontal approach to the difficult problem of 
overlapping concepts of therapeutic condi- 
tions can be made. 

The present investigation, using verbal
transcripts of the group interaction as the 
sample, will include both constant and vari- 
able errors that might be eliminated if 
motion pictures with sound tracks had been 
used. However, the presence of random 
errors operates only to reduce the magnitude 
of obtained relationships. The presence of 
absolute or constant errors operates only to 
change the absolute values obtained for a 
given variable—they do not affect the rela- 
tionship obtained between two measures as 
given by single or multiple regression tech- 
niques. Thus these two classes of errors in 
the present study might reduce the size of 
obtained relationships, but they would never
spuriously increase the size of such relation- 
ships. 
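
The argument above about constant and random errors can be illustrated with a small simulation. What follows is only a sketch under stated assumptions: the scores are artificial, the names `condition`, `exploration`, `shifted`, and `noisy` are hypothetical, and the product-moment correlation stands in for whatever regression technique is applied. It shows that adding a constant to every score leaves the correlation unchanged, and that adding independent random error lowers it in this simulated sample, as it does on average.

```python
import math
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


rng = random.Random(0)  # fixed seed so the illustration is repeatable
n = 2000  # hypothetical count, deliberately larger than the 126 samples of the study

# Artificial "error-free" condition scores and a related exploration measure.
condition = [rng.gauss(0, 1) for _ in range(n)]
exploration = [c + rng.gauss(0, 1) for c in condition]

# Constant error: every condition score is shifted by the same amount.
shifted = [c + 5.0 for c in condition]

# Random error: independent noise is added to every condition score.
noisy = [c + rng.gauss(0, 1) for c in condition]

r_true = pearson(condition, exploration)
r_shifted = pearson(shifted, exploration)  # equal to r_true up to rounding
r_noisy = pearson(noisy, exploration)      # smaller than r_true in this sample
```

In a sample as small as 126 units, chance fluctuation could occasionally push an error-laden correlation above the error-free one; the attenuation holds as a tendency rather than as a guarantee for every sample.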

The possibility of systematic errors in- 
herent in any research, however, must be 
considered. In the present study systematic 
errors might arise from the use of raters 
who might be influenced in their ratings by 
their theoretical preconceptions. The pre- 
cautions used to minimize this possibility 
will be discussed in the section below de-
voted to the rating procedure and judges.


Patients, Groups, and Institutional Setting


Thirty-nine patients attended a majority of the 
42 hours of therapy, while an additional six pa-
tients attended fewer than 4 sessions. All patients
were hospitalized at Mendota State Hospital for 
treatment of mental disorders; 17 patients were 
diagnosed as schizophrenic reactions, 5 as de- 
pressive reactions, while the remaining 17 members 
were classified as psychoneurotic reactions, char- 
acter disorders, epileptic disorder associated with 
paranoia, manic-depressive psychosis, and pseudo-
neurotic schizophrenia. The patients lived on
several wards. The group included patients in 
both closed and open wards, and in both intensive 
treatment wards where “therapeutic community” 
activities were available and on chronic wards 
where mainly custodial care was available. Pa- 
tients ranged in age from 14 years to 53 years, 
with most patients being in their late twenties and 
early thirties. Although a few of the patients had 
been hospitalized for many years, the majority had 
been hospitalized for less than 2 years, and 
approximately one-fourth for less than 6 months 
when the research began. Since the majority of 
patients were from wards in the same building 
where social activities were planned for several 
wards, almost all patients had social contact with
other group psychotherapy patients outside of the 
group setting. Member-patients reported that they 
would spend many hours each week in two- and 
three-person groups discussing and sharing their 
problems on the wards, so that associations estab- 
lished within the groups were apparently quite 
stable outside of the therapy sessions. 

The 39 continuing patients formed two female
groups of 15 and 11 members each and one male 
group of 13 members. The three therapists lead- 
ing the groups were psychologists on the hospital 
staff. The therapists differed in their approach to 
psychotherapy along almost all dimensions. It was 
felt that a fairly wide range of techniques was 
used in the three groups; the graduate training 
of the three therapists was obtained at three 
strongly differing universities. 

The three groups, taken together, can be con- 
sidered as relatively highly effective; 25 of the 39 
patients were subsequently discharged from the 
hospital and were out of the hospital at the time 
of a follow-up one year later. Moderate improve- 
ment was noted in eight of the remaining patients, 
while one patient appeared to regress during group 
psychotherapy, and a second patient who responded 
well to group therapy committed suicide 2 months 
after the termination of group psychotherapy. 


Sample Units 


Forty-two successive therapy hours for each of 
the three groups were tape recorded for purposes 
of the present investigation. The units obtained 
from the total of 126 recorded sessions of group 
therapy were typewritten transcriptions of 3-minute
samples of group interaction obtained randomly
from either the end of the first one-third or the
end of the second one-third of each therapy hour.
Thus the sample units focus upon the middle seg-
ment of therapy sessions rather than upon very
early or very late sections of each session. The
transcription samples began with the first state-
ment by a "new" person after the designated
starting point in the tape, and continued beyond
the designated end point to finish out a response.
Five of the actual samples used are presented in
the appendix. This procedure resulted in some
variability in the actual length of the samples;
the mean sample length was 3 minutes and [...]
seconds, with a range of from 2 minutes and [...]
seconds to 3 minutes and 19 seconds. In two cases
the original randomly selected sample unit con-
tained no response by the therapist, so adja-
cent [...] presence of the therapist's response.


Names were deleted from the transcripts and
patients were identified only within each session
by “P1,” “P2,” etc., in their order of appearance.
That is, P1 would not likely be the same person
in different samples. In the final typescripts, each
sample was randomly assigned a code number for
identification so that prospective raters were given
no information as to which group a given sample
came from, who the therapist was, or whether
the sample was taken from early or late therapy
sessions.

The transcriptions themselves, due to poor
enunciation typical of group therapy, simultaneous
talking by several members, and occasional ex-
traneous noises, were a major problem. A word-
for-word transcript was first obtained for each
sample. Then a second person listened to the
tape recording to check for word accuracy (in a
few instances the patient himself was asked to
listen to the tape to verify the word accuracy).
Then the author rechecked the word accuracy.
Finally, an attempt was made to supply descriptive
adjectives of the vocal qualities of the tape record-
ing to the transcript. Descriptions such as
“rapidly,” “sarcastically,” and “therapist and group
laughs uproariously,” were added to the transcript
and then checked for agreement by a second
person who listened to the tape recordings. Using
such a procedure it was felt that a high level of
accuracy was attained, and that the descriptive
words and phrases conveyed much of the vocal
quality available from the recordings. The final
transcripts were duplicated for use by raters.


Rating Procedure and Judges 


The use of judges to obtain ratings of dimen-
sions under study of the 126 samples was the
principal method of measurement used. On the
basis of small pilot studies, 9-point rating scales
were devised to measure the conditions present
in each sample (with the exception of therapist
responsivity which was simply a frequency meas-
ure). Since more than one judge rated each sample
on a given dimension, reliability or agreement esti- 
mates were available for all scales. The percentage 
of agreement by pairs of judges on a given scale 
ranged from 84 to 96 when tabulations were made 
for "within one point agreement" on the 9-point 
scales. However, since approximately one-third
of the obtained ratings were given midpoint 
ratings, the relatively high percentage agreements 
might have occurred in part because judges gave 
average ratings to ambiguous samples. In spite 
of this, however, it is felt that relatively adequate 
reliability was attained. 
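
The "within one point agreement" tabulation described above can be sketched as follows. This is a minimal illustration and not the tabulation procedure of the study itself: the function name `within_one_agreement`, the tolerance parameter, and the ratings are all hypothetical. The last lines compute the share of midpoint ratings, which bears on the caution that heavy use of the midpoint may inflate agreement.

```python
def within_one_agreement(ratings_a, ratings_b, tolerance=1):
    """Percentage of samples on which two judges differ by at most `tolerance` points."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both judges must rate the same samples")
    if not ratings_a:
        raise ValueError("at least one rated sample is required")
    hits = sum(1 for a, b in zip(ratings_a, ratings_b) if abs(a - b) <= tolerance)
    return 100.0 * hits / len(ratings_a)


# Hypothetical 9-point ratings of eight samples by two judges.
judge_a = [5, 3, 7, 5, 2, 9, 5, 4]
judge_b = [5, 5, 6, 4, 2, 6, 5, 5]

agreement = within_one_agreement(judge_a, judge_b)

# Proportion of all ratings that fall on the scale midpoint (5 on a 9-point scale).
all_ratings = judge_a + judge_b
midpoint_share = sum(1 for r in all_ratings if r == 5) / len(all_ratings)
```

With these invented ratings the two judges agree within one point on six of the eight samples.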

In all, a total of 13 judges of heterogeneous 
background participated in the present study: 
3 psychiatric social workers, 1 psychiatrist, 5 
psychologists, 2 graduate students in clinical psy- 
chology, and 2 highly trained lay persons. The 
judges formed a heterogeneous sample not only 
in disciplinary training, but also in their orientation 
to psychotherapy: 5 were relatively client centered 
in orientation; 2 were psychoanalytic in orienta- 
tion; 2 were oriented toward psychiatric group 
case work; and the remaining 4 described them- 
selves as eclectic or as having a learning theory 
orientation. Such heterogeneity of judges should 
tend to minimize systematic errors arising from 
theoretical preconceptions. 

The rating scales for conditions were randomly 
assigned to judges except that different judges 
received differing numbers of scales. A single 
judge, by his request, was given from three to 
five scales to rate. It will be remembered that in
no case did the rater have information about the
source of the samples, so that ratings were made
"blind" with respect to the therapist in the sample,
the patient, and the time in therapy from which
the sample was drawn. As a further control,
where judges rated several scales, each scale was
done separately rather than simultaneously.


Measuring Instruments 


Measurement of Intrapersonal Exploration 


Since the adequacy and validity of the measure-
ment of the criterion of intrapersonal exploration
is crucial for the present investigation of facilita-
tive therapeutic conditions, three separate measures
were used. The three measures selected are con-
sidered divergent approaches to the problem of
measuring intrapersonal exploration: (a) an adap-
tation of Rogers and Rablen's Process Scale
(1958 unpublished); (b) an Insight Scale, de-
signed to measure the occurrence of new percep-
tions of relationships between old experiences or
feelings; and (c) a Personal Reference Scale,
measuring the number of personal references per
word emitted.


Process Scale. The Process Scale was devised 
to give operational meaning to the client centered 
conception of constructive personality change
within psychotherapy sessions. It was designed to
measure not only the amount of intrapersonal
exploration, but also the depth or extent to which
the patient explores himself in psychotherapy.
Very briefly, process is defined along seven dimen-
sions involving both cognition and feeling, and
moving from a point of fixity, rigidity, and frag-
mentation to a point of integrated changeness,
from an external and rigid locus of evaluation
to an internal and relative locus of evaluation.
Validation studies have been completed demon- 
strating its relationship to various criteria of 
success in therapy, so that relatively more con- 
fidence is placed in this scale as a measure of 
intrapersonal exploration than in the following 
two scales which lack empirical validation (Hart, 
1958 unpublished; Tomlinson, 1959). In the
present investigation two judges did independent 
ratings of each of the 126 samples using the 
Process Scale manual. These two raters, who 
did not participate in any other ratings in this 
study, showed a moderate level of agreement on 
the 70-point scale as indicated by a correlation 
coefficient of .64. In the analysis of the data the 
ratings from the two raters were added together 
to form a pooled process rating for each sample.
The correlation of part-whole for each rater's 
rating with the pooled rating was .84 and .79, in- 
dicating that raters contributed relatively equally 
to the pooled scores. 
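
The pooling of two raters' scores and the part-whole correlations mentioned above can be sketched as follows. The ratings are invented, the names `rater_a`, `rater_b`, and `pooled` are hypothetical, and the product-moment coefficient is assumed as the index of correlation; the sketch shows only the form of the computation, not the data of the study.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Invented Process Scale ratings by two judges for four samples.
rater_a = [1, 2, 3, 4]
rater_b = [2, 1, 4, 3]

# Pooled rating: the two judges' ratings added together, sample by sample.
pooled = [a + b for a, b in zip(rater_a, rater_b)]

inter_rater = pearson(rater_a, rater_b)    # agreement between the two judges
part_whole_a = pearson(rater_a, pooled)    # first judge against the pooled score
part_whole_b = pearson(rater_b, pooled)    # second judge against the pooled score
```

Because each judge's ratings are contained in the pooled score, a part-whole correlation of this kind will ordinarily exceed the correlation between the two judges; its use here is to compare the judges' relative contributions rather than to estimate reliability.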

The importance of the development of insight, 
or the perception of new relationships between 
old experiences or feelings is emphasized clearly 
in both client centered and psychoanalytic writings. 
Research studies by Seeman (1949) and Blau 
(1953) both underscore the relationship of insight 
to successful individual psychotherapy. For the 
present study an Insight Scale was devised for 
application to group psychotherapy using a 9-point 
rating scale, defined as follows: 

Insight Scale. A low state of insight is when 
group members are simply “telling their story” 
or “catharting”; when they are perseverating in old,
“known” feelings. A high state of insight is when 
group members are able to relate previously 
thought-to-be-unrelated feelings or experiences; 
when they experience two or more feelings, etc., 
as related which were unrelated—when the person 
finds a new basis for relating feelings or experi- 
ences. 

Thus the present Insight Scale differentiates 
between patient statements that actively explore 
new relationships within the self from those that 
simply repeat self-related material without explor- 
ing new areas or feelings. Three judges each 
independently rated the 126 samples. Agreement 
within one point between pairs of judges is given 
by the following percentage agreements: .87, .88, 
and .84. The score for the Insight Scale used
in the analysis was the mean value for the three
judges, rounded off to the nearest whole number.
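
A minimal sketch of the two computations just described, agreement within one point between pairs of judges and the rounded mean used as the score, is given below. The ratings are invented, the judge labels are hypothetical, and rounding halves upward is an assumed convention that the source does not specify.

```python
from itertools import combinations
from math import floor

# Hypothetical 9-point Insight ratings by three judges (not the study's data).
ratings = {
    "judge_1": [3, 5, 7, 4, 6],
    "judge_2": [4, 5, 5, 4, 7],
    "judge_3": [3, 7, 6, 5, 6],
}

def within_one_point(x, y):
    """Proportion of samples on which two judges differ by at most one point."""
    hits = sum(1 for a, b in zip(x, y) if abs(a - b) <= 1)
    return hits / len(x)

# Agreement for each pair of judges.
agreement = {
    (j, k): within_one_point(ratings[j], ratings[k])
    for j, k in combinations(ratings, 2)
}

def round_half_up(value):
    """Round to the nearest whole number, halves upward (assumed convention)."""
    return floor(value + 0.5)

# Score used in analysis: mean of the three judges, rounded.
scores = [round_half_up(sum(triple) / 3) for triple in zip(*ratings.values())]
```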


Personal Reference Scale. The Personal Refer- 
ence Scale was simply the number of personal 
pronouns given by patients per sample divided
by the number of words given by patients per
sample. It was felt that this would provide a 
very crude but objective measure of the self- 
orientation of the patient statements. It is logically 
expected that intrapersonal exploration demands 
self-oriented statements and thus this measure 
would provide some estimate of intrapersonal 
exploration. The Personal Reference Scale, although
a crude measure, does have the valuable
characteristic of being an entirely objective
measure.
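
The ratio defined above can be sketched as follows. The pronoun list is an assumption: the source does not say which words were counted, so a first-person list is used here on the reading that the scale targets self-orientation. The function name and the sample sentence are hypothetical.

```python
import re

# Assumed pronoun list; the source does not say which words were counted.
PERSONAL_PRONOUNS = {
    "i", "me", "my", "mine", "myself",
    "we", "us", "our", "ours", "ourselves",
}

def personal_reference_ratio(text):
    """Personal pronouns divided by total words for one sample of patient speech."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    pronouns = sum(1 for w in words if w in PERSONAL_PRONOUNS)
    return pronouns / len(words)

sample = "I hated my mother and I never told her"
ratio = personal_reference_ratio(sample)   # 3 pronouns out of 9 words
```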

In summary, the Process Scale was designed as 
an overall measure of the amount and depth or 
extent of intrapersonal exploration, while the 
Insight Scale and the Personal Reference Scale 
were designed to measure specific aspects of intra- 
personal exploration. 


Measurement of Therapist Conditions 


Nine-point scales were devised to estimate the 
degree to which each therapist condition was pres- 
ent within each sample unit. Following the pro- 
cedure used in the Insight Scale, the rating 
dimension was defined for high and low levels of 
conditions, but the exact scaling values were 
defined by the judges. In the case of all rated 
variables the scores used in the analysis were 
the mean values for each sample of the ratings 
by the two or three judges participating for each 
scale, rounded off to the nearest whole number. 
The specific scales were defined as follows: 


Empathic Understanding. A low state of ther- 
apist’s empathic understanding is when he ignores, 
misunderstands, or does not even attempt to sense 
the patient's “private world"; when he evaluates 
the patient, gives advice, sermonizes, etc. A high 
level of empathic understanding is when he is 
experiencing an accurate, empathic understanding 
of the patient's awareness of his own experience; 
to sense the patient's private world as if it were 
his own, but without losing the "as if" quality; 
to sense the patient's anger, fear, or confusion 
as if it were the therapist's own, yet without the 
therapist’s own anger, fear, or confusion getting 
bound up in it. He can communicate his under- 
standing of what is clearly known to the patient 
and can also voice meanings in the patient's ex- 
perience of which the patient is scarcely aware. 
The therapist's remarks fit in just right with the 
patient's mood and content. The therapist knows 
what the patient means. He is able to share the 
patient’s feelings. 

Accurate Empathy. A low level of accurate 
therapist’s empathy is when the therapist is no 
longer “with” the patient, but is off on a tangent
of his own; when he has misinterpreted what the 
patient is feeling or experiencing; when he is 
responding to a feeling that is expected of the 
patient, but is not currently a feeling of the 
patient. At a moderate level of accuracy the 
therapist may be responding to a feeling actually
present but he has overestimated or underestimated
its intensity. A high level of accurate empathy 
is when the response catches the exact feeling 
with the exact intensity of affect. He is exactly 
“with” the experiencing of the patient and com- 
municates this. He neither underestimates nor 
overestimates the intensity of feeling and accu- 
rately communicates this. He is responding to 
clarify the feeling that the patient has been, or 
is, struggling with.


Genuineness or Congruence. A low level of 
therapist’s genuineness is when he presents a 
facade, either knowingly or unknowingly; a de- 
nial of actual feelings so that if his experience 
is “I am afraid of this patient,” he might become 
autocratic or defensive. A high level of therapist 
congruence is when he is freely and deeply himself.
This includes being himself even in ways which
are not regarded as ideal for psychotherapy. This
does not mean he must overtly express all of his 
feelings—only that he does not deny them; the 
opposite of a facade—that he is genuinely himself 
in the relationship. 


Unconditional Positive Regard. A low level is 
when the therapist evaluates a patient or his feel- 
ings, when he expresses dislike or disapproval for 
a patient or his feelings or experiences. When the 
therapist expresses a selective evaluating attitude— 
“you are bad in these ways, good in those.” A 
high level is when the therapist experiences a 
warm acceptance of each aspect of a patient's 
experience as being part of that person; when 
there are no conditions of acceptance and warmth; 
when there is as much feeling of warmth and 
acceptance for the client’s expression of negative, 
“bad,” or defensive or abnormal feelings as for his 
expression of “good,” positive, and mature feel- 
ings; when the therapist experiences a nonposses- 
sive caring for the patients—as separate persons 
with permission to have their own feelings and 
experiences; a prizing of the patients. 

Leadership. A low level of therapist’s leader- 
ship is when the interaction is primarily between 
patient members of the group—when members 
address their responses to other members, rather 
than to the therapist; when the therapist is little 
more than a participating member; when the 
atmosphere is laissez-faire; when other members 
have assumed the leadership role. A high level of 
therapist leadership is when the interaction is
frequently between the therapist and other mem- 
bers—when members address their statements to 
the therapist, rather than to other members. The 
exercise of the leadership role by the therapist 
includes a very permissive leadership when it is 
clear that the therapist is leading, rather than 
just participating as a member. 

Assumed Similarity. A low state of the thera- 
pist’s assumed similarity of self and member- 
patients is when he sees himself as very different 
from patients; when he assumes that patients, 
their feelings, experiences, or actions are very 




PROCESS OF GROUP PSYCHOTHERAPY 11 


different from himself; when he has a conscious 
or unconscious feeling of "I wouldn't do or feel 
that.” A high level of therapist's assumed similarity 
is when he sees patients as essentially people like 
himself; when he appears to feel that many of 
the patients act, feel, and are very much like 
himself—when he feels that under similar con- 
ditions he might feel or act in the way that they 
have felt or acted. 


Responsivity. The final measure of therapist's 
conditions, responsivity, was simply a frequency 
count of the number of therapist responses occur- 
ring in each sample. All responses, including 
interjections such as "Mhm," were counted. 


Measurement of Group Conditions 


As in the measurement of the amount of 
therapist conditions occurring in each of the group 
psychotherapy sample units, the measurement of 
the group conditions involved the use of rating 
scales designed specifically for each of the con- 
ditions under study. The scales and the rating 
procedures were identical in form to those used 
to estimate the therapist conditions. Again, two 
or three judges were used for each scale and the 
scores used in analysis were mean rating values
rounded off to the nearest whole number. The 
actual scales were defined for the dimensions 
involved as follows:


Concreteness or Specificity of Expression. A 
low level of concreteness or specificity is when 
there is a discussion of anonymous generalities; 
when the discussion is on an abstract intellectual 
level. This includes discussions of “real” feelings 
that are expressed on an abstract level. A high 
level of concreteness or specificity is when specific 
feelings and experiences are expressed—"I hated 
my mother!" or ". . . then he would blow up
and start throwing things"; when expressions 
deal with specific situations, events, or feelings, 
regardless of emotional content. 


De-individuation. A low state of de-individuation 
of the group is when members respond to others 
in terms of personal individual characteristics;
when group members are clearly seen and re- 
sponded to as individuals; when members attend 
to others as persons. A high state of de-individua- 
tion is when group members do not attend to others 
as individuals, but respond in terms of “what” 
was said, rather than “who” said it; when group 
members attend to the content or feeling of a 
statement rather than to “who” it came from. 


Empathic Understanding. This is exactly as 
given under "Therapist Conditions" except that 
the word "group" was substituted for the word
"therapist."


Cooperative Spirit. A low state of cooperative 
and mutually helpful group spirit is when there 
is much interindividual competition; when
each member is competing to be heard; when members


are exclusively concerned with themselves and 
their problems, so that they frequently do not listen 
to others; when members are attacking one
another. A high state of cooperative spirit is when
members are cooperating and attempting to be 
helpful, even though their "help" may be inept or 
even harmful; when members are sincerely con- 
cerned for the welfare of other members. This 
includes questioning when it is an attempt to help, 
supportive moves, sharing of similar experiences 
when not self-oriented, etc. 


Sociability. A low level of sociability is when 
members are problem oriented, when members 
interact only in terms of the content of discussion, 
when members interact on an intellectual level, 
etc.; when the quality of the interaction is
unfriendly or openly hostile. A high level of group
sociability is when members respond on a social 
basis—when they joke amongst themselves, when 
they talk about dances or social activities, etc.; 
when they discuss social content in a friendly, 
personal manner. 


Genuineness or Congruence. This is exactly as 
given under “Therapist Conditions" except that 
the word "group" was substituted for the word 
“therapist.” 


Group Cohesiveness. A low state of group co- 
hesion is when the group atmosphere is unpleasant 
or unrewarding to the members; when they do not 
like each other, or what is being discussed; when 
there is a lack of any group spirit or unity—when 
the group situation itself is unattractive and the 
members would rather not be there or do not wish 
to participate in the group. A high state of group 
cohesion is when the group is highly attractive to 
its members. There is a strong group spirit, even
if it is not a therapeutically positive one. This 
includes the presence of satisfying relationships, 
feelings that members “get something out of it,” 
and a feeling that the discussion is meaningful;
when members feel the atmosphere to be pleasant— 
when there are strong attractions to being in the 
group for any reason. 

Ego Involvement in the Discussion. A low state 
of ego involvement is when group members are
not interested in what the group is discussing; 
when there is little personal investment, when 
the group is apathetic, superficial, or frankly
bored; when members feel that what is being
discussed is of no consequence to themselves. A
high level of ego involvement is when members 
feel that they themselves have a "stake" in what 
is said; when they feel that the discussion is 
relevant and of importance to themselves; when 
they enthusiastically enter into the discussion and 
have a personal investment in what is being said. 
This includes being highly interested in the dis- 
cussion even though they regard it as not being 
personally relevant for themselves—as when they 
are personally invested in helping another group
member, etc.


Toward Evaluation of the Method


At the very least the present investigation 
studied a method for quantification of di- 
mensional properties of interpersonal inter- 
action in the highly complex and perplexing 
setting of group psychotherapy. The method 
involved multiple measurement, by means 
primarily of rating scales, of therapist and 
patient in the naturalistic setting of group 
psychotherapy. A partial proof of the
validity of the method lies in its success or 
failure in detecting expected phenomena. 
Thus, to the extent that it achieves reasonable
power in detecting relationships, the
present method offers promise for investiga- 
tion of other interpersonal interactions in 
other contexts. The primary requirement 
for adequate reliability appears to have been 
met, since agreements of judges within one 
point ranged from 84% to 96% for the 
scales devised for the present study. 


OVERALL RELATIONSHIPS BETWEEN 
CONDITIONS AND INTRAPERSONAL 
EXPLORATION 


The primary questions posed by the hy- 
potheses of the present investigation deal 
with the relevance of the therapist condi- 
tions and the group conditions to the criteria 
measures of intrapersonal exploration. To 
attempt an answer to these questions, scales 
were devised to quantify these character- 
istics occurring within the group psycho- 
therapy setting. 

It was suggested earlier that the three 
measures were not considered to be equiva- 
lent measures of intrapersonal exploration. 
Using only a priori considerations it was 
suggested that they could be ordered with 
respect to the amount of "true" measure- 
ment of intrapersonal exploration contained 
in the three scales. The Process Scale was 
considered as the overall measure of amount 
and depth of intrapersonal exploration; the 
Personal Reference Scale was considered 
the most crude and indirect measure. Such 
an ordering is also reflected in the obtained 
variances of the three measures and in the 
intercorrelations between the measures. The 
Process Scale with a mean of 59.25 and a 


standard deviation of 8.89 shows the least 
variability; the Insight Scale has a mean 
of 4.89 and a standard deviation of 1.67; 
the Personal Reference Scale has a mean 
of 92.40 and a standard deviation of 36.82. 
The intercorrelations between the measures 
of intrapersonal exploration are consistent 
with the a priori expectations: the correla- 
tion between the Process Scale and the 
Personal Reference Scale (r = .33) is, by
the z test, significantly less than that obtained
between the Process Scale and the
Insight Scale (r = .53), while the correlation
between the Insight Scale and the Personal
Reference Scale (r = .44) lies between
these two values.
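
The source does not say which form of the z test was used. As a hedged illustration only, the sketch below applies Fisher's r-to-z transformation in its independent-samples form to the two reported coefficients, with n = 126 for each. That form is an assumption, and strictly the two correlations are not independent, since both involve the Process Scale. The function names are hypothetical.

```python
from math import atanh, erf, sqrt

def fisher_z_difference(r1, r2, n1, n2):
    """z statistic for the difference of two correlations (independent-samples form)."""
    se = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (atanh(r1) - atanh(r2)) / se

def upper_tail_p(z):
    """One-tailed p value from the standard normal distribution."""
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))

# Process-Insight (.53) against Process-Personal Reference (.33), 126 samples each.
z = fisher_z_difference(0.53, 0.33, 126, 126)
p_one_tailed = upper_tail_p(z)
```

Under these assumptions the statistic falls just below 2, so the difference would reach the .05 level on a one-tailed test.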

In interpreting the results, then, greatest
weight will be given to relations between conditions
and the criterion measure of the
Process Scale, and least weight will be
given to obtained relations between hypothesized
conditions and the criterion measure
of the Personal Reference Scale.

It will be remembered that of the total 
16 hypothesized therapeutic conditions, only 
the responsivity of the therapist was an 
objective frequency measure: there were 15 
rating scale measures. Table 1 presents the 
means and standard deviations for the 
measures of conditions. As can be seen, 
the means of all rated variables approach 
the midpoints of the 9-point rating scales 
with desirable variances. 

In the present single regression analyses 
the population used is the total 126 samples 
of group psychotherapy interaction taken 
from three heterogeneous groups. It is as- 
sumed, of course, that these samples, involv- 
ing a total of 44 patients, are representa- 
tive of group psychotherapy interactions 
common to psychotherapeutically oriented 
groups. To the extent that this is true, the 
results may be generalized to group psycho- 
therapy in other settings. 


Hypothesized Therapist Conditions 


TABLE 1

MEANS AND STANDARD DEVIATIONS OF CONDITION MEASURES

Condition Measures | M | SD

Therapist Conditions:
Empathic Understanding
Accurate Empathy
Genuineness or Self-Congruence
Unconditional Positive Regard
Leadership
Responsivity
Assumed Similarity

Group Conditions Controlled by Therapist:
Concreteness or Specificity | 4.35 | 1.74
De-individuation | 4.80 | 1.40
Empathic Understanding | 5.05 | 1.49
Unconditional Positive Regard | 4.71 | 1.38
Cooperative Spirit | 4.69 | 1.70
Sociability | 4.75 | 1.49

Group Conditions Indirectly Influenced by Therapist:
Genuineness or Self-Congruence of group members | 5.23 | 1.66
Cohesion | 5.84 | 1.26
Ego Involvement | 5.35 | 1.79


The obtained product-moment coefficients
of correlation between the seven hypothesized
therapist conditions and the three
measures of intrapersonal exploration are
presented in Table 2. Each correlation was
based upon the pooled ratings of two or
more judges for both the condition and the
criterion measure on 126 samples. As can
be seen from Table 2, the results tend to
confirm the therapeutic relevance of all
hypothesized therapist conditions with the
single exception of Empathic Understanding,
which, although showing a relationship
to both the Process Scale and the Insight
Scale in the predicted direction, falls short
of the required significance level. The finding
of significant correlations between Accurate
Empathy, Genuineness, and Unconditional
Positive Regard and both the
Process Scale and the Insight Scale is
taken as positive support for the client
centered theory of psychotherapy. The finding
of significant positive relationships to
the criterion measures with the scale designed
to measure Accurate Empathy and
the lack of significant findings with the
scale designed to measure Empathic Understanding
suggests either that: (a) the former
scale is simply a more sensitive measure
of empathy, or (b) that the effective component
of empathy is sensitivity to feelings
in the patient rather than an ability to share
these feelings. Since the scales were designed
to study precisely this difference in
concepts of empathy, the present author
would tentatively entertain the second
interpretation.
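
Why a correlation of .15 falls short of the .05 level on 126 samples while one of .18 reaches it can be checked with the usual t transformation of a product-moment coefficient. The sketch below is an illustration under assumptions: a two-tailed test is assumed (the source does not state this), the critical value of about 1.98 for 124 degrees of freedom is an approximation, and the function name is hypothetical.

```python
from math import sqrt

def t_for_correlation(r, n):
    """t statistic for testing a product-moment correlation against zero (df = n - 2)."""
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)

N_SAMPLES = 126
T_CRITICAL = 1.98  # approximate two-tailed .05 critical value for 124 df (assumed test)

# Values reported in Table 2 against the Process Scale.
t_empathic = t_for_correlation(0.15, N_SAMPLES)    # Empathic Understanding
t_leadership = t_for_correlation(0.18, N_SAMPLES)  # Leadership
```

On these assumptions the first statistic lies below the critical value and the second above it, which agrees with the asterisks in Table 2.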

The criterion measure of the rate of per- 
sonal pronoun emission, the Personal Refer- 
ence Scale, shows only relationships to the 
two measures of therapist activity level, 
Leadership and Responsivity. These results 
suggest that therapist leadership in group 
psychotherapy is indeed therapeutically 
relevant. The single therapist condition 
which shows a significant association with 
all three measures of intrapersonal explora- 
tion is Leadership, while Responsivity is 
related to intrapersonal exploration as meas- 
ured either by the Personal Reference Scale 
or the Insight Scale. 

Assumed Similarity shows a significant 
relationship to process and a highly signifi- 
cant relationship to the occurrence of per- 
ception of new relationships between old 
experiences or feelings (Insight Scale). 

TABLE 2

CORRELATIONS BETWEEN THERAPIST CONDITIONS
AND INTRAPERSONAL EXPLORATION

Therapist Conditions | Process Scale | Insight Scale | Personal Reference Scale
Empathic Understanding | .15 | .15 | -.07
Accurate Empathy | .34* | .33* | .04
Genuineness or Congruence | .24* | .22* | .06
Unconditional Positive Regard | .23* | .18* | -.04
Leadership | .18* | .23* | .21*
Responsivity | .04 | .24* | .18*
Assumed Similarity | .25* | .41* | .11

* Significant at or beyond .05 level.


The results, then, of the analysis of therapist
conditions tend to confirm the therapeutic
relevance of the hypothesized conditions.
Chance fluctuations in the sampling
and measurement would account for the
presence of only one significant correlation
and that one at only the .05 level. As it
is, 13 of the 21 correlations are significant
beyond the .05 level, and only the Empathic
Understanding scale fails to show a significant
correlation with at least two of the
three measures of intrapersonal exploration.
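
The chance argument can be made concrete with a small calculation. The sketch below computes the number of significant results expected by chance among 21 tests at the .05 level, and the binomial probability of 13 or more. Treating the 21 tests as independent is an assumption made only for illustration, since the correlations share samples and measures. The function name is hypothetical.

```python
from math import comb

N_CORRELATIONS = 21        # 7 therapist conditions x 3 criterion measures
ALPHA = 0.05
OBSERVED_SIGNIFICANT = 13

# Expected number of significant correlations if only chance were operating.
expected_by_chance = N_CORRELATIONS * ALPHA

def binomial_upper_tail(n, k, p):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

p_at_least_observed = binomial_upper_tail(N_CORRELATIONS, OBSERVED_SIGNIFICANT, ALPHA)
```

The expected count is close to one, as the text states, and the probability of 13 or more by chance is vanishingly small under the independence assumption.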

In order to gain a more exact understand- 
ing of the relationships described by the 
correlations in Table 2, the 126 samples 
were divided into six groups of 21 samples 
each, using the distribution of the Process 
Scale values as a basis of grouping the data. 
The therapist conditions were then plotted 
as a function of mean Process Scale level. 
The resulting curves for the seven hypothe- 
sized therapist conditions were essentially 
linear with the marked exception of the con- 
dition of Self-Congruence or Genuineness 
of the therapist. The plot of the regression 
of Self-Congruence on Process Scale level, 
shown in Figure 1, is essentially nonlinear;
quite low values of therapist Genuineness 
occurred in samples with quite low Process 
Scale values, but there is no relationship 
between intermediate and higher values of 
process and the values of therapist Genuine- 
ness. The implication of this finding is quite 
clear: whereas a lack of Genuineness or 
Self-Congruence on the part of the therapist 
is indeed related to a corresponding lack of 
intrapersonal exploration in the patient, 
beyond a minimal level additional Genuine- 
ness or Self-Congruence of the therapist is 
not related to increased intrapersonal
exploration by patients. This would suggest
that the client centered hypothesis concern- 
ing genuineness of the therapist might be 
put in a different form, stating that a lack 
of therapist Genuineness inhibits intraper- 
sonal exploration. That is, while Genuine- 
ness does not facilitate intrapersonal ex- 
ploration, the presence of a conscious or 
unconscious facade inhibits intrapersonal 
exploration. 
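
The grouping procedure described in this section (ordering the samples by Process Scale value, dividing them into six equal groups, and plotting the mean condition rating against the mean Process Scale level) can be sketched as follows. The data are invented and shaped only to mimic a curve that rises and then levels off. The function and variable names are hypothetical.

```python
def mean_by_process_level(process, condition, n_groups=6):
    """Sort samples by Process Scale value, split into equal groups, average each."""
    pairs = sorted(zip(process, condition))   # order samples by Process Scale value
    size = len(pairs) // n_groups             # 126 // 6 = 21 in the study
    curve = []
    for g in range(n_groups):
        chunk = pairs[g * size:(g + 1) * size]
        mean_process = sum(p for p, _ in chunk) / size
        mean_condition = sum(c for _, c in chunk) / size
        curve.append((mean_process, mean_condition))
    return curve

# Hypothetical data: 12 samples in 6 groups of 2 (not the study's data).
process = [40, 44, 50, 52, 55, 57, 60, 62, 65, 66, 70, 72]
genuineness = [2, 3, 5, 5, 6, 5, 6, 5, 5, 6, 6, 5]
curve = mean_by_process_level(process, genuineness)
```

In this invented example the mean condition rating is low only in the lowest group and flat thereafter, the kind of pattern the text reports for therapist Genuineness.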


Hypothesized Group Conditions under 
Direct Control of Therapist 


The correlation coefficients describing the
obtained relationships between the hypothesized
conditions and the criterion measures
of intrapersonal exploration are given in
Table 3. It will be immediately noted that
the Concreteness or Specificity of expression
of feelings or experiences by the group
shows an extremely high relationship to all
three measures of intrapersonal exploration;
this would indicate that the group condition
of Concreteness or Specificity is an extremely
potent therapeutic variable. This
finding is perhaps somewhat surprising in
that Concreteness is a dimension of psychotherapy
left largely untouched by the major
theoretic viewpoints. The magnitude of
the relationship suggests that psychotherapy,
regardless of the viewpoint of the therapist,
must involve discussion of concrete and
specific feelings or experiences if it is to be
effective. One implication is that discussions
of abstract feelings or experiences
(generalized experiences or feelings) strongly
inhibit patient self-exploration in group
psychotherapy.


TABLE 3

CORRELATIONS BETWEEN GROUP CONDITIONS
DIRECTLY UNDER CONTROL OF THE
THERAPIST AND INTRAPERSONAL
EXPLORATION

Group Conditions | Process Scale | Insight Scale | Personal Reference Scale
Concreteness or Specificity | .47* | .63* |
De-individuation | .09 | -.15 | .02
Empathic Understanding | .25* | .04 | .03
Unconditional Positive Regard | .11 | -.16 | .02
Cooperative Spirit | .19* | .02 | -.07
Sociability | -.2 | -.26* | -.11

* Significant at .05 level.

It may also be seen from Table 3 that
both De-individuation and Unconditional 
Positive Regard show no reliable relation- 
ship to intrapersonal exploration. Perhaps 
part of the explanation for the lack of effect 
of Unconditional Positive Regard can be 
found in the relationships between the two 
other measures of group “warmth” and 
intrapersonal exploration. Sociability shows 
a negative or inhibitory relationship to self- 
exploration, while Cooperative Spirit shows 
a positive or facilitative relationship. These
apparently contradictory findings are consistent
with the a priori assumption advanced
earlier that the three conditions
aimed at group warmth involve “supportive 
warmth” or “minimization of problems” 
which tend to inhibit further self-explora- 
tion and also “understanding warmth” 
which tends to expect (and hence make 
some demand for) further intrapersonal 
exploration. Certainly such theoretic speculation
is consistent with the negative effect
of Sociability, which does involve supportive
statements by the group, and with the positive
effect of Cooperative Spirit, which
does involve helping others to understand
themselves.

The hypothesized facilitative effect of 
Empathic Understanding by the group is 
supported with respect to the Process Scale, 
but not by the Insight or Personal Refer- 
ence scales. That is, Empathic Understand- 
ing for a patient by the group members is 
related to the amount and depth of intra- 
personal exploration, but not to specific 
aspects such as the development of new 
insight and the use of self-references. This 
lack of relationship to insight development,
in contrast to the findings with therapist
Empathy, might suggest that empathic
understanding by the group members is
related to undirected exploration. It might
well be that the empathy from group members
reflects an undirecting “understanding
warmth,” rather than a more illuminating
accurate perception of the person’s feelings
and experiences.


Hypothesized Group Conditions Indirectly 
Influenced by Therapist 


The correlations between the three group
conditions which appear to be only under
indirect control of the therapist and the
measures of intrapersonal exploration are
given in Table 4. It will be immediately
noted that all three conditions show relatively
high correlations with the criterion
measures. These results would suggest that
all three conditions are indeed relatively
potent variables in group psychotherapy.
Genuineness or Self-Congruence of the
group members shows a relatively high
association with all three measures of intrapersonal
exploration. In contrast to the
nonlinear regression of therapist Genuineness,
the regression of Congruence or Genuineness
of the group, as well as Cohesiveness
and Ego Involvement, is essentially
linear when plotted as a function of amount
of intrapersonal exploration. In the case of
Genuineness or Self-Congruence, it might
well be argued that this measure lies along
a dimension of psychological disturbance-adjustment.
From an examination of those
samples rated high and low on this condition,
however, it appears that even though
the material discussed by patients in a relatively
high state of congruence may be
psychotic or indicative of gross disturbance,
the patients are less defensive and more able
to share with other group members their
feelings or experiences.


TABLE 4

CORRELATIONS BETWEEN GROUP CONDITIONS INDIRECTLY
INFLUENCED BY THE THERAPIST AND
INTRAPERSONAL EXPLORATION

Group Conditions | Process Scale | Insight Scale | Personal Reference Scale
Genuineness or Self-Congruence of group members | | |
Group Cohesion | .38* | .18* | .04
Ego Involvement of group members | .42* | .21* | .11

This finding (that the greater the Genu- 
ineness of the group members in the psycho- 
therapy relationship, the greater the amount 
and depth of intrapersonal exploration, the 
development of insight, and the rate of 
personal references) seems a direct contra- 
diction of one aspect of client centered 
theory. Rogers (1957) has specifically 
stated the necessity of the client's being 
incongruent for successful psychotherapy. 

From a careful reading of Rogers' dis- 
cussion of the requirement of incongruence 
on the part of the patient, it would appear 
that incongruence by the patient is conceived 
of as the source of anxiety. The presence 
of anxiety in the patient, rather than the 
maintenance of a facade, appears to be the 
meaning of incongruence when applied to 
the patient. That is, the functional role of 
a lack of self-congruence in client centered 
theory is an explanation of the development 
of anxiety. As pointed out earlier, however, 
self-congruence appears to be essentially a 
lack of defensiveness, rather than a lack of 
anxiety. Thus a revision of this aspect of 
client centered theory suggested by the pres- 
ent results might state the necessity of both 
vulnerability to anxiety (as the motivation 
for personality change) and self-congruence 
within the psychotherapy relationship (to 
allow for intrapersonal exploration). Under 
such a revised theoretic interpretation one 
of two possibilities emerges: (a) the source 
of vulnerability to anxiety does not lie in 
the lack of self-congruence, or, (b) a person 
who is normally incongruent in daily living 
(due to perceived threat) is able to become 
self-congruent within the psychotherapy re- 
lationship (due to the absence of threat 
created by the acceptance and warmth of 
the therapist). For the present, the author 
will entertain the latter hypothesis. In any 
event, the results do strongly suggest that 
the presence of a facade, either conscious or
unconscious, by the patient or the therapist, 


operates to inhibit the process of group 
psychotherapy. 

The second condition, group Cohesion, 
shows significant positive association with 
both Process Scale level and the Insight 
Scale. These results indicate that cohesion, 
long a central concept in the analysis of 
small group behavior, is also of importance 
in the analysis of group psychotherapy: successful
group psychotherapy groups are
cohesive. This may be a somewhat circular 
finding in that at least part of the attraction 
of the group may be due to its success in 
helping its members. However, these find- 
ings not only suggest the fruitfulness of 
applying knowledge of attitude change ob- 
tained from studies of experimental groups, 
but also point to a variable unique to the
group setting and one which is susceptible 
to external manipulation. 

Also, the finding that Ego Involvement
in the group discussion is highly related to
intrapersonal exploration might be open to 
a similar circular argument. Again, the 
obtained correlations are of both theoretic 
and practical significance, in that at the very 
least ego involvement is a measurable com- 
ponent of effective group psychotherapy. 

The results of the regression analyses 
thus far presented, then, have supported the 
relevance of 13 of the 16 hypothesized ther- 
apeutic conditions to patient self-explora- 
tion. The results have also pointed to the 
importance of several conditions not explicitly
dealt with thus far by current
theories of psychotherapy or by previous 
research. Finally, the findings have sug- 
gested the relative magnitudes of the 
hypothesized conditions’ effects upon intra- 
personal exploration: the two most potent 
conditions, Concreteness or Specificity of 
discussion and Self-Congruence or Genu- 
ineness of the group members, have been 
both explicitly hypothesized and investigated 
for the first time. 


WITHIN GROUP RELATIONSHIPS BETWEEN
CONDITIONS AND INTRAPERSONAL 
EXPLORATION 


The relationships reported above between 
the hypothesized conditions and the meas- 


ures of intrapersonal exploration were 
obtained from the total 126 samples. It will
be remembered that the three groups from 
which these samples were drawn (42 from 
each) were purposefully heterogeneous with 
respect to the patient population and thera- 
pist orientation. Groups A and C were 
composed of female patients while Group B
was an all male group. It would therefore 
be expected that the three groups might 
differ on several of the therapist conditions 
and group conditions under investigation. 
The obtained means and standard deviations 
for each of the groups separately are given 
for each of the conditions measured in 
Table 5. 

It would be desirable to analyze the rela-
tionships between the conditions and intra- 
personal exploration within each of the 
three groups. It may be noted that the find- 
ing of an overall positive relationship be- 
tween, for example, Accurate Empathy and 
the Process Scale does not mean that this 
relationship will necessarily be positive within
each of the three groups. Depending
upon the array means within each group, it 
is possible to have an overall positive rela- 
tionship and strong negative relationships 
within each of the three subsamples (Kemp- 
thorne, 1952; Rao, 1952). 
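The possibility noted above, an overall positive relationship alongside negative relationships within every subsample, can be illustrated with a minimal sketch. The nine (condition, criterion) pairs below are hypothetical values chosen only to make the effect visible; they are not data from the present study, and the helper `pearson` is an illustrative name.

```python
from math import sqrt

def pearson(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical (condition, criterion) pairs for three groups.
# Within each group the criterion falls as the condition rises,
# but the group means of both variables rise together.
groups = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

# Correlation within each group separately.
within = {name: pearson(*zip(*pairs)) for name, pairs in groups.items()}

# Correlation after pooling all nine pairs into one sample.
pooled_pairs = [pair for pairs in groups.values() for pair in pairs]
pooled = pearson(*zip(*pooled_pairs))

for name, r in within.items():
    print(f"Group {name}: r = {r:+.2f}")
print(f"Pooled:  r = {pooled:+.2f}")
```

Here the sign of the pooled coefficient is determined by the spacing of the group means rather than by the trend inside any one group, which is why the within-group analysis is reported separately.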


Process Scale Relationships 


The correlation coefficients between the 
hypothesized therapeutic conditions and the 
amount and depth or extent of intrapersonal 
exploration as measured by the Process 
Scale for each group separately are pre- 
sented in Table 6. It can be seen, then, that 
in general the correlations obtained for each 
group separately are essentially identical (in 
fact the averages of the three are slightly 
higher) to those obtained using the whole 
sample (Tables 2, 3, and 4). Also, the
correlations from the three groups do not
differ significantly from one another except
in the case of the Responsivity measure.
Using the z transformation to test for
differences between correlations, the
relationship between Responsivity of the
therapist and intrapersonal exploration for
Group A differs significantly from those
obtained for both Groups B and C. As can be
seen, there is a nonsignificant tendency for
this relationship to be negative for Group A,
while for Groups B and C there are significant
positive relationships between Responsivity
and intrapersonal exploration. Further, since
Groups A and C are the two extremes in average
level of therapist Responsivity, these findings
are probably unrelated to the average absolute
responsivity of the therapist.


TABLE 5

MEANS AND STANDARD DEVIATIONS OF CONDITION MEASURES FOR THE THREE GROUPS

                              Group A        Group B        Group C
Condition Measures            M      SD      M      SD      M      SD

Therapist Conditions: 5.98 1.52
  Empathic Understanding      4.74   1.74
  Accurate Empathy            4.64   1.81


TABLE 6

CORRELATIONS BETWEEN HYPOTHESIZED CONDITIONS
AND THE PROCESS SCALE FOR EACH GROUP
SEPARATELY

Hypothesized Conditions              Group A      Group B      Group C

Therapist Conditions:
  Empathic Understanding               .14          .16          .38*
  Accurate Empathy                     .25          .40*         .60*
  Genuineness or Self-Congruence       .26       [illegible]  [illegible]
  Unconditional Positive Regard        .38*         .22          .33*
  Leadership                           .24          .22          .31*
  Responsivity                        -.12          .36*         .47*
  Assumed Similarity                   .27          .39*         .36*
Group Conditions Controlled by Therapist:
  Concreteness                         .53*         .50*         .50*
  De-individuation                     .11         -.08          .15
  Empathic Understanding               .20          .22          .33*
  Unconditional Positive Regard        .15          .02         -.12
  Cooperative Spirit                   .23          .14          .24
  Sociability                         -.27         -.28         -.11
Group Conditions Indirectly Influenced by Therapist:
  Genuineness or Self-Congruence       .53*         .63*         .52*
  Group Cohesion                       .43*         .12          .41*
  Ego Involvement                      .43*         .47*         .33*

* Significant at or beyond the .05 level of confidence.
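The comparison of Responsivity correlations between groups can be sketched as follows. This is a minimal illustration assuming the usual large-sample form of the z transformation test for two independent correlations and a two-sided criterion of 1.96; the exact computing procedure used in the study is not stated. The correlations are the Responsivity entries of Table 6 and n = 42 is the number of samples per group; the function name is illustrative.

```python
from math import atanh, sqrt

def z_for_difference(r1, n1, r2, n2):
    """Normal deviate for the difference between two independent
    correlations after the z (inverse hyperbolic tangent) transformation."""
    standard_error = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (atanh(r1) - atanh(r2)) / standard_error

N_PER_GROUP = 42                                  # samples drawn from each group
responsivity = {"A": -0.12, "B": 0.36, "C": 0.47}  # Table 6, Process Scale
CRITICAL = 1.96                                   # assumed two-sided .05 criterion

comparisons = {}
for first, second in [("A", "B"), ("A", "C"), ("B", "C")]:
    z = z_for_difference(responsivity[first], N_PER_GROUP,
                         responsivity[second], N_PER_GROUP)
    comparisons[(first, second)] = z
    verdict = "significant" if abs(z) > CRITICAL else "not significant"
    print(f"Group {first} vs Group {second}: z = {z:+.2f} ({verdict})")
```

Under these assumptions the two comparisons involving Group A exceed the criterion while the comparison of Groups B and C does not, in agreement with the pattern described in the text.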

It will be remembered that the prediction 
of a positive relationship between therapist 
Responsivity and the criterion of intraper- 


sonal exploration was based upon the as- 
sumption that such responses would be high 
in Empathic Understanding, Unconditional 
Positive Regard, and other hypothesized 
therapeutic conditions. As can be seen in 
Table 5, the therapist in Group A tends on 
the average to communicate less Empathic 
Understanding, Accurate Empathy, and Un- 
conditional Positive Regard than do the therapists
of Groups B and C. That is, a positive relation be-
tween therapist Responsivity and intraper- 
sonal exploration is obtained in the two 
groups where the average therapist values 
for understanding and warmth are highest, 
and a nonsignificant negative relation is 
obtained in the group where the average 
therapist values for these characteristics of 
the therapist are lowest. 


A second and possibly better explanation


of the difference in correlation of Respon- 
sivity with intrapersonal exploration be- 
tween Group A and Groups B and C might 
lie in the relationship between Responsivity 
and other positive therapist characteristics 
for each of the three therapists. These cor- 
relations are presented in Table 7. As can 
be seen, there is a strong general tendency 
for Therapist A to become less therapeu- 
tically positive in his responses when he 
responds more frequently, while there is a 
strong general tendency for Therapists B 
and C to become more therapeutically posi- 
tive when they respond more frequently. 
Using the z transformation to test for
differences between therapists, the
correlations for Therapist A differ
significantly from those for Therapist C on
all five characteristics in Table 7; there
are no significant differences between these
correlations for Therapist B and Therapist C.


TABLE 7

CORRELATIONS BETWEEN THERAPIST RESPONSIVITY
AND OTHER THERAPIST CHARACTERISTICS

Therapist Characteristics          Therapist A   Therapist B   Therapist C

Empathic Understanding               -.31*          .05           .16
Accurate Empathy                     -.12           .22           .64*
Unconditional Positive Regard        -.16           .03           .27
Self-Congruence or Genuineness       -.07           .12           .49*
Assumed Similarity                   -.15           .21           .56*

* Significant at or beyond the .05 level of confidence.

The results of the above analysis might 
suggest that Responsivity interacts with 
other positive characteristics of the thera- 
pist’s responses to either facilitate or inhibit 
intrapersonal exploration in the patient. 
Whether this interaction is with the mean 
values for these other characteristics or with 
the correlation of Responsivity to these 
characteristics cannot be determined from 
the available data. 


Insight Scale Relationships 


The correlations between the development 
of perceptions of new relationships between 
old feelings and experiences and the hypoth- 
esized conditions for each group separately 
are presented in Table 8. In general these 
correlations are very similar to those with 
the Process Scale presented in Table 6, with 
the exception that Empathic Understanding 
of the group members is negatively related 
to insight development for Groups A and B 
but not for Group C. From an examination 
of Table 5 it could be speculated that this 
difference may be related to the tendency 
for Group C to be more socially oriented, 
although the findings of the present study 
do not give evidence on this point. 

The results of the analysis, then, of asso- 
ciations between the hypothesized therapeu- 
tic conditions and the two measures of intra- 
personal exploration for each of the three 
groups separately are in close agreement 
with the results obtained using the total
sample, and thus provide further support
for the hypotheses under study. Further,
using both the Process Scale and the Insight
Scale as criteria measures, the original
hypotheses concerning the relationship be-
tween therapist Responsivity and intraper-
sonal exploration were supported. Of the
16 original hypothesized therapeutic condi- 
tions only 2 have been unsupported by the 
data (group De-individuation and Uncon- 
ditional Positive Regard of the group for 


its members). All 7 of the hypothesized 
therapeutic characteristics of the therapist, 
and 7 of the 9 hypothesized therapeutic 
group characteristics were found to be sig- 
nificantly related to the criteria measures of 
intrapersonal exploration. 


INDEPENDENCE OF HYPOTHESIZED
THERAPEUTIC CONDITIONS 


One of the central problems facing the 
scientific investigation of psychotherapy— 
indeed, personality and social psychology as 
well—is that of overlapping concepts. Both 
psychologists and psychoanalysts attempting 
to describe the complex phenomena collec- 
tively termed “psychotherapy” have invented
and ascribed scores of names and reams of
description to a single phenomenon. The
problem presented to the investigation of
psychotherapy is one of estimating the
minimum number of variables, dimensions, or
concepts that will account for the maximum
variance in the criteria of outcomes in
psychotherapy. The present author has
investigated 16 “dimensions” or
characteristics of group psychotherapy with
results that suggest that 14 of these
therapeutic conditions are indeed associated
with the criterion of intrapersonal
exploration. Still, it is very unlikely that
all 14 of the effective variables are
independent of each other.


TABLE 8

CORRELATIONS BETWEEN HYPOTHESIZED CONDITIONS
AND THE INSIGHT SCALE FOR EACH
GROUP SEPARATELY

Hypothesized Conditions              Group A      Group B      Group C

Therapist Conditions:
  Empathic Understanding              -.12          .22          .32*
  Accurate Empathy                  [illegible]     .62*         .49*
  Genuineness or Self-Congruence       .16       [illegible]     .42*
  Unconditional Positive Regard        .09          .47*         .28
  Leadership                        [illegible]  [illegible]     .43*
  Responsivity                        -.06          .28          .49*
  Assumed Similarity                   .21          .48*         .56*
Group Conditions Controlled by Therapist:
  Concreteness                         .56*         .70*         .64*
  De-individuation                     .03         -.15         -.19
  Empathic Understanding              -.32*        -.37*         .14
  Unconditional Positive Regard     [illegible]     .00         -.13
  Cooperative Spirit                   .08         -.01         -.19
  Sociability                         -.23       [illegible]  [illegible]
Group Conditions Indirectly Influenced by Therapist:
  Genuineness or Self-Congruence    [illegible]     .63*      [illegible]
  Group Cohesion                       .28          .06       [illegible]
  Ego Involvement                      .21          .39*         .28

* Significant at or beyond the .05 level of confidence.

The approach chosen by the present in- 
vestigator to this central problem of deter- 
mining independence of conditions in their 
effects on psychotherapy is the use of analy- 
sis of variance of multiple regression 
(Kempthorne, 1952; Rao, 1952). 

Multiple regression theory or, in words 
perhaps more familiar, the theory of the 
general linear hypothesis is the basis for 
most parametric analyses of data. The basic 
assumption, of course, is that observations 
(say, Process Scale values) are expressible 
as linear functions of some known variables 
$X_1, \ldots, X_p$ (say, hypothesized condi-
tions), with residual errors which are nor-
mally and independently distributed around
zero with constant variance. The model is
then:

$$Y_i = A + B_1 X_1 + B_2 X_2 + \cdots + B_p X_p + e_i$$

The problem is to estimate the constants, 
Bs, which will satisfy the equation. The 
method of least squares is used to esti- 
mate these values in the set of p simul- 
taneous equations. Once these regression 
coefficients are derived, the sum of squares 
of deviations about the estimated coefficients 
can be computed. From this the significance 
of a regression coefficient can be evaluated 
by the usual variance ratio. 
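The estimation and testing steps just described can be sketched in a few lines. This is an illustration only: the eight observations and the names `solve`, `fit`, `x1`, and `x2` are hypothetical, the normal equations are solved by ordinary Gaussian elimination, and the variance ratio for a predictor is formed by comparing the residual sum of squares with and without it. It is not a reconstruction of the computations actually carried out in the study.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit(columns, y):
    """Least-squares fit of y on a constant plus the given predictors.
    Returns (coefficients, residual sum of squares)."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(coef, r))) ** 2
              for r, yi in zip(rows, y))
    return coef, rss

# Hypothetical observations: two conditions and one criterion measure.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 15.2, 16.8]

coef_full, rss_full = fit([x1, x2], y)   # A, B1, B2
_, rss_reduced = fit([x1], y)            # model omitting x2

df_error = len(y) - 3                    # observations minus fitted constants
f_x2 = (rss_reduced - rss_full) / (rss_full / df_error)

print("coefficients:", [round(b, 3) for b in coef_full])
print("variance ratio for x2:", round(f_x2, 2))
```

The variance ratio for `x2` asks only whether that variable reduces the residual sum of squares beyond what `x1` already achieves, which is the sense in which later sections speak of a condition adding to the regression equation.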

In the present analysis all conditions with 
the single exception of group De-individua- 
tion, which showed no reliable association 
with the criteria, were used as the con- 
comitant variables, and alternatively the 
Process Scale, the Insight Scale, and the 
Personal Reference Scale as the dependent 


variable. In each analysis, then, the cri-
terion measure of intrapersonal exploration
can be expressed as a linear function of the
condition variables, and each condition can
be tested for the significance of its effect in
the resulting equation. Using this proce-
dure, the condition with the highest single
significant correlation to the criteria will be
given the largest weight, and the test will
then be of any additional contribution of the
succeeding variables. This procedure will,
then, give us the minimum number of in- 
dependent conditions necessary to account 
for the variability of our criteria within the 
confines of the present research; it permits 
the analyzing of variation into component 
parts. 

Since it has been shown that the correla- 
tions obtained using the total sample are 
essentially identical to those obtained sepa- 
rately from the subsamples of the three 
groups, it would be expected that either 
procedure would yield similar results for 
multiple regression analysis. As the interest 
of the present research is in a population 
of group psychotherapy interactions, the 
total of 126 samples will be used for 
analysis.


Sources of Variation in Process Scale


The multiple correlation of the 15 hy-
pothesized therapeutic conditions and the
Process Scale for the 126 samples is .75.
That is, 53% of the variance in process is
accounted for by the conditions under study
in the present research. Table 9 presents
the results of the analysis of variance of
multiple regression of conditions on process.
For convenience, the tests of significance for
several conditions simultaneously are pre-
sented in the tables, although such tests were
made separately from those of conditions
individually.

As can be seen from Table 9, many condi-
tions related to Process Scale values, such
as Accurate Empathy, do not significantly
add to the regression equation when other
variables are included in the analysis. In
all, seven of the hypothesized conditions
account for significant amounts of separate
sources of variance in Process Scale values:


(a) therapist Leadership, (b) therapist
Self-Congruence or Genuineness, (c) group 
Concreteness or Specificity of content dis- 
cussion, (d) group Empathy, (e) Genuine- 
ness or Self-Congruence of group members, 
(f) group Ego Involvement, and (g) group 
Cohesion. That is, each of these seven con- 
ditions accounts for separate and independ- 
ent components of the amount and depth 
of intrapersonal exploration as measured by 
the Process Scale. 
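The variance ratios for the seven conditions listed above can be recomputed from the sums of squares printed in Table 9, as a check on how the table is read. The sketch assumes that each F is the mean square for the source divided by the residual mean square, and that the total sum of squares (not printed in the table) equals the regression plus residual sums of squares; the dictionary labels are illustrative.

```python
# Sums of squares and degrees of freedom as printed in Table 9.
RESIDUAL_SS, RESIDUAL_DF = 4686.67, 110
REGRESSION_SS = 5211.19

sources = {
    "All conditions simultaneously": (15, 5211.19),
    "Therapist Leadership": (1, 168.08),
    "Therapist Genuineness-Congruence": (1, 174.17),
    "Group Concreteness": (1, 318.22),
    "Group Empathy": (1, 357.22),
    "Group Genuineness": (1, 287.83),
    "Group Ego Involvement": (1, 222.79),
}

residual_mean_square = RESIDUAL_SS / RESIDUAL_DF

# Variance ratio: mean square for the source over residual mean square.
f_ratios = {name: (ss / df) / residual_mean_square
            for name, (df, ss) in sources.items()}

# Proportion of variance accounted for, assuming total = regression + residual.
r_squared = REGRESSION_SS / (REGRESSION_SS + RESIDUAL_SS)

for name, f in f_ratios.items():
    print(f"{name:35s} F = {f:6.3f}")
print(f"Proportion of variance accounted for: {r_squared:.2f}")
```

On these assumptions the recomputed ratios agree with the printed F column to rounding, and the proportion of variance is about .53, matching the percentage stated in the text.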

Within the confines of the present study 
some estimate of the relative importance of 
the conditions can be made, but since these 
cannot be estimated in general (because the 
estimates depend upon which and how many 
conditions are included in the regression 
equation) it would seem desirable to rely 
upon the results of the single regression 
analyses reported earlier. 

It must be pointed out that the particular 
seven conditions accounting for significant 


and separate sources of variance of process 
are due in large part to the interrelation- 
ships obtained between the hypothesized 
conditions, so that the exact labeling of the 
seven sources of variation in process might 
change under relatively mild sampling fluc- 
tuations. Thus it might be that in another 
sample group Cooperative Spirit would re- 
place group Empathic Understanding since 
Empathic Understanding is only slightly 
more highly related to process than is Co- 
operative Spirit (correlations of .25 and .19, 
respectively) and both are relatively highly 
related to each other (correlation of .41). 
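Why one of two interrelated conditions can displace the other may be seen from the three correlations just cited. The sketch below uses the standard two-predictor formulas for standardized regression weights; treating Empathic Understanding and Cooperative Spirit as the only two predictors is a simplifying assumption made for illustration, not the fifteen-variable analysis reported in Table 9.

```python
# Correlations quoted in the text.
r_process_empathy = 0.25      # group Empathic Understanding with process
r_process_cooperative = 0.19  # group Cooperative Spirit with process
r_between = 0.41              # Empathic Understanding with Cooperative Spirit

# Standardized weights for a regression on just these two conditions.
denominator = 1.0 - r_between ** 2
beta_empathy = (r_process_empathy - r_process_cooperative * r_between) / denominator
beta_cooperative = (r_process_cooperative - r_process_empathy * r_between) / denominator

# Squared multiple correlation for the pair, and the increment each
# condition adds once the other is already in the equation.
r2_both = beta_empathy * r_process_empathy + beta_cooperative * r_process_cooperative
gain_from_cooperative = r2_both - r_process_empathy ** 2
gain_from_empathy = r2_both - r_process_cooperative ** 2

print(f"beta (Empathic Understanding) = {beta_empathy:.3f}")
print(f"beta (Cooperative Spirit)     = {beta_cooperative:.3f}")
print(f"R squared, both conditions    = {r2_both:.3f}")
print(f"added by Cooperative Spirit   = {gain_from_cooperative:.3f}")
print(f"added by Empathic Understanding = {gain_from_empathy:.3f}")
```

In this two-variable illustration either condition adds little once the other is present, so a small shift in the sample correlations could reverse which of the two enters the equation.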

The relationships between the conditions 
which did not significantly enter into the 
multiple regression on process and the seven 
variables which did are presented in Table 
10. By referring to Tables 2, 3, and 4, it 
can be seen that such variables as Accurate 
Empathy by the therapist (which showed 
significant correlation with process) do not 


TABLE 9

ANALYSIS OF VARIANCE OF MULTIPLE REGRESSION OF CONDITIONS ON PROCESS SCALE

Source                                          df        SS         F      B Weight

All Conditions Simultaneously                  (15)    5,211.19    8.154*

Therapist Conditions Simultaneously:            (7)      605.80    2.031
  Accurate Empathy                               1        98.08    2.302      .64
  Assumed Similarity                             1        49.40    1.16      -.57
  Leadership                                     1       168.08    3.95*      .79
  Genuineness-Congruence                         1       174.17    4.09*      .90
  Unconditional Positive Regard                  1         1.27     .03       .07
  Responsivity                                   1        60.82    1.43      -.29
  Empathic Understanding                         1        62.53    1.47      -.49

Group-Therapist Conditions Simultaneously:      (5)      815.93    3.83*
  Concreteness                                   1       318.22    7.47*     1.22
  Empathy                                        1       357.22    8.38*     1.41
  Cooperative Spirit                             1        54.82    1.29       .48
  Sociability                                    1         7.48     .18      -.22
  Unconditional Positive Regard                  1         4.57     .11      -.16

Group Conditions Simultaneously:                (3)    1,093.19    8.55*
  Genuineness                                    1       287.83    6.76*     1.23
  Ego Involvement                                1       222.79    5.23*      .90
  Cohesiveness                                   1       169.75    3.96*     1.07

Residual Error                                 110     4,686.67
Total                                          125

* Significant at or beyond the .05 level of confidence.


CHARLES B. TRUAX


TABLE 10

INTERCORRELATIONS BETWEEN SELECTED CONDITIONS

Column key: (1) Therapist Leadership, (2) Therapist Congruence,
(3) Concreteness and Specificity, (4) Group Empathy,
(5) Group Congruence, (6) Ego Involvement, (7) Group Cohesion

Conditions                         (1)     (2)     (3)     (4)     (5)     (6)     (7)

Therapist:
  Accurate Empathy                 .23*    .19*    .29*    .17     .33*    .09     .09
  Assumed Similarity               .18*    .28*    .33*    .08     .37*    .20*    .08
  Unconditional Positive Regard    .01     .10     .23*    .08     .20*    .21*    .13
  Responsivity                     .50*    .25*    .28*   -.23*    .09    -.04    -.16
  Leadership                               .26*    .27*   -.33*    .18*   -.11    -.28*
  Congruence                                       .19*   -.09  [illegible] .08    .12
Group:
  Cooperative Spirit              -.31*   -.04    -.09     .41*    .08     .07     .34*
  Sociability                     -.45*   -.06    -.38*    .40*   -.25*   -.11     .19*
  Unconditional Positive Regard   -.05    -.18*   -.03     .19*    .13     .07     .24
  Concreteness                                            -.10     .54*    .35*    .10
  Empathy                                                          .01     .13     .21*
  Genuineness                                                              .39*    .27*
  Ego Involvement                                                                  .32*

* Significant at or beyond the .05 level of confidence.


add to the multiple regression equation 
because in this sample its effect is already 
accounted for by the more potent conditions 
of: Self-Congruence or Genuineness of the 
group members, and Concreteness of dis- 
cussion. 

The finding that five of the seven condi- 
tions discussed above are group conditions 
appears to suggest that the group atmos- 
phere is perhaps more potent in its effects 
on intrapersonal exploration than is the 
therapist. Further, it might be assumed that 
the formation of these effective group char- 
acteristics is the primary function of the 
therapist, and as can be seen from Table 10, 
that Accurate Empathy, Assumed Simi- 
larity, and Unconditional Positive Regard 
are primarily effective in group psycho- 
therapy because they facilitate such group 
conditions. 

Regardless of theoretic interpretation, the 
results do indicate that the above seven con- 
ditions must be at least considered as pa- 
rameters of effective group psychotherapy 
defined by the amount and depth of intra- 
personal exploration. 


Sources of Variation in Insight Scale 


Using the Insight Scale as the dependent 
variable, the analysis of variance of multiple 
regression yielded the results presented in 
Table 11. The obtained multiple correlation 
coefficient was .71, which indicates that
50% of the total variation in the measured
development of insight in patient members
can be accounted for by the hypothesized
conditions. Again, the magnitude of such
a finding, considering the inherent unrelia-
bility of the phenomena studied and the lack
of precision measuring instruments, suggests
the central therapeutic relevance of the
hypothesized conditions.

As can be seen from an inspection of
Table 11, only one of the hypothesized
therapeutic conditions under study (Con-
creteness or Specificity of the group discus-
sion) contributes a significant source of
variation in patient insight development.
That is, only 1 of the 15 conditions here
studied can be considered as necessary for
the development of insight; no other con-
dition contributes a significant additional
source of variance. Examination of Tables
2, 3, 4, and 10 together indicates that the
relationship between the hypothesized con- 
ditions (such as Accurate Empathy) and 
the development of insight could be entirely 
explained by these conditions' association 
with Concreteness of discussion. These 
results clearly point to the overwhelming 
importance of Concreteness in effective psy- 
chotherapy once the importance of the 
development of insight is admitted. Cer- 
tainly psychoanalytic and client centered 
theorists as well as most learning theorists 
conceive of the perception of new relations 
between old feelings or experiences as a 
necessary, although not sufficient, antecedent 
for effective psychotherapy. 


Sources of Variation in Personal 
Reference Scale 


In the analysis using the Personal Refer- 
ence Scale as the dependent variable, pre- 


sented in Table 12, two conditions signifi- 
cantly accounted for separate sources of 
variance: the Concreteness or Specificity of
group discussion, and the Genuineness or 
Self-Congruence of members of the group. 
That is, once the degree to which the group 
is discussing concrete or specific feelings or 
experiences in an open and genuine manner 
is known, then knowledge of the other 
hypothesized therapeutic conditions does not 
substantially add to our prediction of Per- 
sonal Reference Scale values. 

The accuracy of the prediction of the rate 
of personal pronoun emission by the patient 
group members may be judged by the 
obtained multiple correlation coefficient of 
.61, which accounts for over one-third of
the total variation in the Personal Reference 
Scale. 

Viewing the results of the multiple re- 
gression analyses from a slightly different 
vantage suggests that the scales designed 


TABLE 11

INSIGHT SCALE ANALYSIS OF VARIANCE OF MULTIPLE REGRESSION

Source                                          df        SS         F      B Weight

All Conditions Simultaneously                  (15)      174.85    7.252*

Therapist Conditions Simultaneously:            (7)       13.42    1.193
  Accurate Empathy                               1          .91    0.571     0.06
  Assumed Similarity                             1         5.83    3.63      0.19
  Leadership                                     1          .01     .01      0.01
  Congruence                                     1          .88     .55      0.06
  Unconditional Positive Regard                  1         1.47     .91     -0.07
  Responsivity                                   1          .19     .12      0.02
  Empathy                                        1          .23     .14     -0.03

Group-Therapist Conditions Simultaneously:      (5)       57.48    7.15*
  Concreteness                                   1        49.30   30.67*     0.48
  Group Empathy                                  1         2.67    1.66      0.12
  Cooperative Spirit                             1          .29     .18       .03
  Sociability                                    1         1.11     .69      -.08
  Group Unconditional Positive Regard            1          .14     .09      -.03

Group Conditions Simultaneously:                (3)        7.13    1.48
  Group Genuineness                              1         4.16    2.59       .15
  Group Ego Involvement                          1         2.17    1.34      -.08
  Group Cohesion                                 1         1.72    1.07       .11

Residual Error                                 110       176.80
Total                                          125

* Significant at or beyond the .05 level of confidence.


TABLE 12

PERSONAL REFERENCE SCALE ANALYSIS OF VARIANCE OF MULTIPLE REGRESSION

Source                                          df        SS         F      B Weight

All Conditions Simultaneously                  (15)   63,315.03    4.37*

Therapist Conditions Simultaneously:            (7)   13,784.23    2.04
  Accurate Empathy                               1     2,129.35    2.21     -2.98
  Assumed Similarity                             1       113.25     .12      -.86
  Leadership                                     1     2,046.03    2.11      2.15
  Congruence                                     1        51.92     .05       .49
  Unconditional Positive Regard                  1       258.24     .27     -1.01
  Responsivity                                   1       252.78     .26       .60
  Empathy                                        1     3,449.70    3.57     -3.61

Group-Therapist Conditions Simultaneously:      (5)   23,416.10    4.85*
  Concreteness                                   1    17,806.06   18.44*     9.11
  Group Empathy                                  1     3,432.84    3.56      4.35
  Cooperative Spirit                             1     1,829.68    1.89     -2.71
  Sociability                                    1     3,008.29    3.11      4.37
  Group Unconditional Positive Regard            1        90.62     .09       .10

Group Conditions Simultaneously:                (3)   10,991.42    3.79*
  Group Genuineness                              1    10,956.04   11.35*     7.56
  Group Ego Involvement                          1       750.44     .78     -1.65
  Group Cohesion                                 1       122.17     .13      -.95

Residual Error                                 110   106,197.33
Total                                          125

* Significant at or beyond the .05 level of confidence.


to measure aspects of intrapersonal explora- 
tion differ in complexity—as indeed an 
examination of the scales themselves sug-
gests. The Process Scale was designed to
measure the quantity and depth or extent
of intrapersonal exploration; the results in-
dicate that at least seven differing sources 
of variance were necessary to account for 
Process Scale values. By contrast, the In- 
sight Scale and the Personal Reference 
Scale were designed to deal with amounts 
of specific aspects of intrapersonal explora- 
tion; the results indicate that only one and 
two differing sources of variance, respectively, were ac-
counted for by the hypothesized conditions. 
It might be emphasized that the multiple 
regression equations obtained accounted for 
over one-half of all the variation in both 
process and insight in patient group mem- 
bers, and over one-third of all the variation 
in the frequency of personal pronouns per 
unit of speech of patient group members. 


The magnitude of these findings in the
complex area of psychotherapy, where even
small relationships are difficult to find, sug-
gests the fruitfulness of exploring these
hypothesized conditions and, simultane-
ously, the need for further research aimed
at distilling the concepts involved in the
hypothesized conditions.


INDEPENDENCE OF THERAPIST CONDITIONS
SPECIFIED BY CLIENT CENTERED THEORY


In view of the specific theoretical state-
ments of the necessary and sufficient con-
ditions for psychotherapy made by client
centered theory, and the available research
supporting it, it would seem desirable to
investigate the independence of these thera-
pist conditions (Empathy, Genuineness,
Unconditional Positive Regard). That is,
what is the minimum number of such vari-
ables necessary to maximally account for
the variation in process level? Under the
present client centered formulation, all three 
therapist conditions are thought to account 
for separate sources of variation in Process 
Scale level. This is the meaning of "neces- 
sary” conditions. In view of the high 
intercorrelations obtained in earlier research 
investigating the relationships of these 
conditions to successful therapy, however, it 
might be expected that one or more of these 
conditions do not add to the accounted for 
variation in process when other conditions 
are present. 

The independence of these conditions in 
their effect upon intrapersonal exploration 
can be conveniently investigated by the use 
of analysis of variance of multiple regres- 
sion as described earlier. 

The result of the analysis of variance of 
multiple regression of the therapist condi- 
tions of Empathy, Genuineness, Uncondi- 
tional Positive Regard, and Accurate Em- 
pathy on the Process Scale values for the 
126 samples is given in Table 13. Together 
these conditions yield a multiple correlation 
of .39 and thus account for 15% of the
variation in Process Scale level. An inspec- 
tion of Table 13 clearly indicates that only 
two of the three therapist conditions speci- 
fied by client centered theory account for 
independent sources of variance in Process 
Scale values: therapist Genuineness or Self- 
Congruence, and Accurate Empathy. That 
is, although it is clear from the earlier 
analysis that Unconditional Positive Regard 
is related to Process Scale values, it does 
not account for additional variance beyond
that already accounted for by Accurate
Empathy and Genuineness of the therapist. 
This, of course, might suggest that if an 
effective level of Self-Congruence or Gen- 
uineness and of Accurate Empathy is com- 
municated by the therapist in psychotherapy, 
then Unconditional Positive Regard is also 
communicated. An examination of the in- 
tercorrelations between these therapist con- 
ditions adds some clarity to the findings. 
Unconditional Positive Regard is not sig- 
nificantly related to therapist Genuineness 
(as indicated by a correlation of .11) but is 
highly associated with Accurate Empathy 
(as indicated by a correlation of .57). 

A multiple regression analysis was also 
carried out using the Insight Scale as the 
predicted variable, with essentially similar 
results. Unconditional Positive Regard ac- 
counted for no significant independent
source of variance, while Accurate Em- 
pathy (with an F ratio of 8.99) and Genu- 
ineness (with an F ratio of 3.98) both 
accounted for significant and independent 
sources of variation in the development of 
the perception of new relationships between 
old feelings or experiences. The multiple 
correlation coefficient for insight was .36,
accounting for 13% of the variance in the 
Insight Scale. These almost identical re- 
sults, using both general and specific meas- 
ures of intrapersonal exploration, add addi- 
tional support to the findings obtained using 
the Process Scale. 
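The percentages of variance quoted for the three client centered therapist conditions follow directly from squaring the reported multiple correlations, as the short check below shows. Only the two coefficients stated in the text are used; the variable names are illustrative.

```python
# Multiple correlations reported for the client centered therapist conditions.
multiple_r = {"Process Scale": 0.39, "Insight Scale": 0.36}

# Proportion of criterion variance accounted for is the squared coefficient.
variance_accounted = {scale: r ** 2 for scale, r in multiple_r.items()}
percent = {scale: round(100 * v) for scale, v in variance_accounted.items()}

for scale, value in percent.items():
    print(f"{scale}: R = {multiple_r[scale]:.2f}, about {value}% of variance")
```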

TABLE 13

ANALYSIS OF VARIANCE OF MULTIPLE REGRESSION OF
THERAPIST CONDITIONS ON PROCESS SCALE

Source

Empathic Understanding
Genuineness or Self-Congruence
Unconditional Positive Regard
Accurate Empathy
Conditions Simultaneously
Residual Error

* Significant at or beyond the .05 level.


The following, then, can be concluded
from the above analysis: (a) Accurate
Empathy and Unconditional Positive Regard
by the therapist are each individually
related to process, but Accurate Empathy 
is more highly related to process than is 
Unconditional Positive Regard; (b) Accu- 
rate Empathy and Unconditional Positive 
Regard are highly related phenomena in 
psychotherapy, so that the measurement of 
Accurate Empathy includes the effective 
variance of Unconditional Positive Regard;
and (c) Accurate Empathy by the therapist 
and Genuineness of the therapist in the rela- 
tionship account for separate components 
of the variation in intrapersonal exploration. 
It might be speculated that unconditional 
positive regard for another person is a pre- 
condition for the development of accurate 
and deep understanding of such other per- 
son. Such a viewpoint would regard the 
therapist as bringing two separate and in- 
dependent personal or attitudinal character- 
istics to the psychotherapy relationship: a 
warm understanding of the patient, and 
an honest openness to experiencing. In any
event, the above findings suggest a modifica- 
tion in the current client centered theory of 
psychotherapy: of the three hypothesized 
necessary therapist conditions, Genuineness 
and Empathy alone appear sufficient. 


TOWARD EVALUATION OF THE 
PRESENT RESEARCH 


Problem of Causation 


The results, as a whole, tend to support 
14 of the 16 original hypotheses of condi- 
tions facilitative of intrapersonal explora- 
tion in group psychotherapy, and suggest 
that at least 7 different conditions are neces- 
sary to account for intrapersonal explora- 
tion. However, the hypotheses themselves 
are essentially causal hypotheses of ante- 
cedent conditions for effective group psy- 
chotherapy, while the research itself is not 
demonstrably an investigation of causal 
relationships. Both the measures of condi- 
tions and the measures of intrapersonal ex- 
ploration were derived from the same 
samples of group psychotherapy, so that in 
the terminology of experimentation both 
sets of measures were dependent variables. 
For example, the empirical relationship
between therapist's self-congruence and
patient intrapersonal exploration obtained
in the present study does not demonstrate
causation. It might be argued that although
this finding is consistent with the a priori
theoretical prediction based upon a causal
hypothesis, the finding itself does not neces-
sarily support the theory. It might be
argued that high self-congruence of the
therapist is facilitated by patient intraper-
sonal exploration, or that high levels of both
therapist self-congruence and patient self-
exploration are caused by the action of a
third variable. Thus, causal inferences may
be made only on the basis of logic external
to the present research.

Some plausibility to the present causal
hypotheses is given, however, by the Ends
and Page study (1957) reported earlier
where differential outcomes of psycho-
therapy were causally related to different
theoretic orientations to psychotherapy: cli-
ent centered and analytic approaches were
superior to leaderless-group and didactic
approaches. Although in that study no
specific conditions were explicitly manipu-
lated, the present hypothesized therapeutic
conditions were derived from client centered
and psychoanalytic theory and are spe-
cifically not common to leaderless-group and
didactic or authoritarian orientations.

At the very least, the present research,
by demonstrating specific relationships be-
tween therapist response characteristics and
patient self-exploration and between group
atmosphere characteristics and patient intra-
personal exploration, provides an empirical
foundation for further theoretic speculations
of causal relationships and focuses upon
dimensions fruitful for future experimental
research. Also, at the very least, the present
results are consistent with the a priori
hypotheses which were based upon theory
of the antecedent conditions for intraper-
sonal exploration and constructive person-
ality change.


Possibility of Systematic Errors


The possibility of the presence of sys-
tematic errors must be considered. Certainly
the use of rating scale procedures tends to
enhance this possibility through such phe-
nomena as the halo effect, and the intro-
duction of systematic bias due to theoretic
preconceptions of the judges.

It might be suspected, for example, that 
the obtained results are due to the assign- 
ment of high values of both "good" thera- 
pist behavior (high conditions) and of 
“good” patient behavior (high intrapersonal 
exploration) by the judges to group psycho- 
therapy samples that they consciously or 
unconsciously perceive as “good” therapy 
samples. 
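
The halo concern described above can be made concrete with a partial correlation. The following is a minimal sketch and not a computation from the present study: it assumes a hypothetical global "goodness" rating z for each sample (no such measure was collected here), and the correlation values passed in are invented for illustration. If a single global impression drove both the conditions ratings x and the exploration ratings y, the correlation between x and y should shrink toward zero once z is held constant.

```python
import math

def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denom

# Hypothetical correlations (assumptions for illustration only).
# Case 1: both ratings lean heavily on the global impression z.
halo_like = partial_correlation(r_xy=0.50, r_xz=0.70, r_yz=0.70)
# Case 2: neither rating depends much on the global impression z.
independent = partial_correlation(r_xy=0.50, r_xz=0.10, r_yz=0.10)
```

In the first case almost nothing remains of the x-y relationship after z is removed; in the second the relationship is essentially unchanged.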

The use of judges heterogeneous with 
respect to theoretic orientation and disci- 
pline was expected to minimize this possi- 
bility but cannot be used as an argument 
that such systematic bias did not occur. 
That such error does not account for the 
obtained relationships is demonstrated by 
the findings themselves. 

The finding obtained with the use of 
analysis of variance of multiple regression 
that seven different hypothesized conditions 
were necessary to account for variations in 
patient intrapersonal exploration clearly in- 
dicates that at least seven different char- 
acteristics of group psychotherapy were 
effectively discriminated by the judges, 
rather than simply a single “good” and
“bad” dimension. Thus the possibility of
judges rating samples on a single “good-bad”
dimension is not supported by the
data. 

Perhaps the most important findings with 
respect to arguing against the presence of 
systematic errors were those given in Table 
10. That table indicates that each of the 
rating scale measures of therapist conditions 
is related to the objective measure of thera- 
pist Responsivity differently for each of the 
three therapists. It seems necessary to con- 
clude that these relationships reflect per- 
sonality differences among the therapists. 
Now, since it would not be possible for the 
judges to identify the therapists from the 
transcript samples (even if they could be 
identified, would we expect all three judges 
to independently decide to arbitrarily assign 
lower conditions values to more responsive 
samples of Therapist A but higher condi- 
tions values to more responsive samples of 


Therapist B?) the consistent differential 
relationships between rated measures and 
the objective measure of Responsivity for 
the different therapists do tend to strongly 
argue against these findings being due to 
systematic rating errors. That is, since it is 
quite evident that the results presented in 
Table 10 are not conceivably due to sys- 
tematic rater errors, then it is reasonable 
to assume that other relationships involving 
the same measures are also not due to such 
possible errors. 


Generality of Findings 


The present study was aimed at "group 
psychotherapy" rather than the more ge- 
neric "group therapy" and was designed to 
investigate conditions related to patient 
intrapersonal- or self-exploration. It must 
be noted, then, that the present findings are 
not directly applicable to group therapy
where socialization or interpersonal inter- 
action are specified as the main vehicle of 
constructive personality change. Similarly, 
the present findings do not directly relate to 
didactic approaches. 

In short, the present results may be gen- 
eralized only to "psychotherapeutic" group 
psychotherapy in the classical definition 
where exploration of the patient's conscious 
and unconscious feelings and experiences is 
the method employed. In current practice 
this includes analytic group psychotherapy, 
client centered group psychotherapy, and 
eclectic approaches using techniques bor- 
rowed from both major approaches. 

Should it be felt that studies in the area 
of psychotherapy are peculiarly limited, it 
is well to remember the words of Karl 
Pearson (1898):
No scientific investigation is final; it merely 
represents the most probable conclusion which can 
be drawn from the data at the disposal of the 
writer. A wider range of facts, or more refined
analysis, experiment, and observation will lead 
to new formulae and new theories. This is the 
essence of scientific progress. 


SUMMARY 


The primary purpose of the present re- 
search was to investigate the relations be- 
tween hypothesized therapeutic conditions, 
derived from client centered, analytic, and
small group theory, and three criterion 
measures of intrapersonal (self) explora- 
tion in group psychotherapy. Thus this 
study was designed to study relationships 
internal to the process of group psycho- 
therapy, and to contribute to the under- 
standing of the sources of variation in 
patient intrapersonal exploration associated 
with characteristics of the therapist's re-
sponses and with characteristics of the
group interaction. 

Three heterogeneous group psychotherapy 
groups involving a total of 45 hospitalized 
mental patients led by three therapists dif- 
fering in their theoretical orientation to 
psychotherapy were selected to provide the 
population for study. Tape recordings were 
then obtained on 42 successive hours of 
psychotherapy from each of the three 
groups. From these recordings, transcrip- 
tions were made of 126 3-minute group 
interactions to provide the basic data for 
analysis. 

On the basis of pilot studies, rating scales 
were devised to quantify, within each inter- 
action sample, the presence of the following 
characteristics of the therapist's responses 
which were hypothesized to relate positively 
to the criterion measures: Empathic Under- 
standing, Accurate Empathy, Unconditional 
Positive Regard, Self-Congruence or Gen- 
uineness, Assumed Similarity of self and 
patients, and Leadership. Additionally, the 
frequency of therapist's responses within 
each sample was tabulated to provide a 
measure of a seventh hypothesized thera- 
peutic condition, that of therapist's Respon- 
sivity. 

Also, scales were devised to quantify, 
within each sample, the presence of the 
following hypothesized therapeutic char- 
acteristics of the group interaction which 
are relatively under the direct control of 
the therapist: Concreteness or Specificity 
of the group discussions, De-individuation 
of the group interaction, Empathic Under- 
standing by the group of its members, 
Unconditional Positive Regard by the group 
toward its members, Spirit of Cooperation 
and Mutual Helpfulness within the group, 
and Sociability of the group. 


The final three hypothesized therapeutic
characteristics of the group, which are only 
indirectly influenced by the therapist, were 
quantified by similarly devised scales: Self- 
Congruence or Genuineness of the patients 
in the group, Cohesiveness of the group, and 
Ego Involvement of the members, them- 
selves, in the group discussion. 

The test of the hypotheses was the sta- 
tistical significance of associations between 
each of the above hypothesized therapeutic 
conditions and the criteria measures of in- 
trapersonal exploration by the patient 
members. 

Since previous research has established 
some empirical validity for the Rogers and 
Rablen Process Scale and it has been dem- 
onstrated to differentiate more successful 
from less successful therapy cases, this scale 
was selected as the primary criterion meas- 
ure. The Process Scale is designed to 
measure both the amount and the depth or 
extent of intrapersonal exploration, so that 
it provides an overall measure. Two addi- 
tional scales were used to get at more 
specific aspects of intrapersonal exploration. 
First, a rating scale to quantify the develop- 
ment of perceptions of new relations be- 
tween old feelings or experiences (Insight 
Scale) was devised to measure this specific 
aspect of intrapersonal exploration within
each sample. Secondly, an entirely objective
measure of the rate of personal pronouns
emitted by the patients in each sample (Per-
sonal Reference Scale) was used as the
third criterion measure of intrapersonal 
exploration. 
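
An objective pronoun-rate measure of this kind can be sketched as follows. This is an illustration under stated assumptions, not the scoring procedure actually used: the pronoun list and the expression of the rate per 100 words are both assumed here, and the function name is hypothetical.

```python
import re

# Assumed pronoun list: the pronouns actually counted are not enumerated here.
SELF_PRONOUNS = {"i", "me", "my", "mine", "myself",
                 "we", "us", "our", "ours", "ourselves"}

def personal_reference_rate(text: str) -> tuple[int, int, float]:
    """Return (pronoun count, word count, pronouns per 100 words)."""
    normalized = text.lower().replace("’", "'")
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", normalized)
    # Contractions such as "i've" count as "i": keep the part before the apostrophe.
    pronouns = sum(1 for w in words if w.split("'")[0] in SELF_PRONOUNS)
    rate = 100.0 * pronouns / len(words) if words else 0.0
    return pronouns, len(words), rate

# A patient statement taken from Sample 16 in the Appendix.
count, total, rate = personal_reference_rate(
    "I feel small because I thought of something that was brought up."
)
```

Because the count requires no judgment about content, a measure of this kind is not open to the rater biases that may affect the two rated criterion scales.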

A total of 13 judges, including psycholo-
gists, psychiatrists, and psychiatric social
workers of heterogeneous theoretic orienta- 
tions rated the coded interaction samples. 

Correlations between the hypothesized 
therapeutic characteristics of the therapist’s
responses and the measures of intrapersonal
exploration were computed, using all 126
samples. The following therapist conditions
were significantly related to intrapersonal
exploration of the patients: Accurate Em-
pathy, Unconditional Positive Regard, Self-
Congruence or Genuineness, Assumed Sim-
ilarity, Leadership, and Responsivity. These
relationships, with the exception of Self-
Congruence or Genuineness, were essentially
linear. Only a lack of Genuineness was 
associated with low values of intrapersonal 
exploration; there was no relationship be- 
tween intermediate and high values of Gen- 
uineness of the therapist and patient self- 
exploration. 
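
The kind of correlation described above can be sketched with the five samples printed in Table A1 of the Appendix. This is only an illustration: the reported analysis used all 126 samples, the five appendix samples were selected rather than drawn at random, and the product-moment coefficient is assumed here because the type of coefficient is not stated.

```python
import math

def pearson_r(x: list[float], y: list[float]) -> float:
    """Product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined when a list has no variance")
    return sxy / math.sqrt(sxx * syy)

# Values printed in Table A1 for Samples 4, 16, 18, 48, and 89.
accurate_empathy = [1, 2, 8, 8, 8]
process_scale = [37, 53, 86, 64, 76]
r = pearson_r(accurate_empathy, process_scale)
```

With only five hand-picked samples the resulting coefficient says nothing about statistical significance; it merely shows the arithmetic behind each reported association.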

Further, the correlations obtained within 
each group indicated that these relationships 
held for each group psychotherapy group 
separately. These results, then, were taken 
as positive support for the hypotheses. 

Of the group characteristics relatively 
under direct control of the therapist, the 
following conditions were significantly re-
lated to intrapersonal exploration: Concrete-
ness or Specificity of group discussion,
Empathic Understanding by the group of 
its members, Cooperative and Mutually 
Helpful group spirit, and Sociability. Nei- 
ther De-individuation nor group Uncondi- 
tional Positive Regard proved therapeu- 
tically relevant. With the exception of 
group Sociability, which was negatively as- 
sociated with the criterion, the obtained 
significant relationships were in the pre- 
dicted direction. 

All three of the hypothesized therapeutic 
characteristics of the group which are only 
indirectly influenced by the therapist (Gen- 
uineness or Self-Congruence of the group 
members, group Cohesion, and Ego Involve- 
ment of the group in the discussion) were 
significantly associated with patient intra- 
personal exploration in the predicted 
direction. 

By means of analysis of variance of mul- 
tiple regression the hypothesized conditions 
which accounted for separate sources of
variation in the criterion measures were
determined. A multiple prediction equation
accounting for over one-half of the total
variation in the Process Scale was obtained
in which the following seven conditions
accounted for significant amounts of sepa-
rate sources of variation: therapist's Self-
Congruence or Genuineness, therapist's
Leadership, Concreteness or Specificity of 
the group discussion, group Empathic Un- 
derstanding of its members, Genuineness or 
Self-Congruence of the group members, 
Ego Involvement of the group members in 
the discussion, and group Cohesion. 
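
What it means for a condition to account for a "separate source of variation" can be sketched for the simplest case of two predictors. This is an illustration using the standard two-predictor formula for the squared multiple correlation, not the seven-condition analysis reported here, and the correlation values are hypothetical.

```python
def r_squared_two_predictors(r_y1: float, r_y2: float, r_12: float) -> float:
    """Squared multiple correlation of y on two predictors, from correlations."""
    if abs(r_12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)

def added_variance(r_y1: float, r_y2: float, r_12: float) -> float:
    """Share of criterion variance the second predictor adds beyond the first."""
    return r_squared_two_predictors(r_y1, r_y2, r_12) - r_y1 ** 2

# Hypothetical correlations, not values from the study.
# Predictors that overlap heavily (r_12 = .80) versus not at all (r_12 = .00).
overlapping = added_variance(r_y1=0.60, r_y2=0.50, r_12=0.80)
separate = added_variance(r_y1=0.60, r_y2=0.50, r_12=0.00)
```

A condition may therefore correlate substantially with the criterion and still add almost nothing to the prediction equation if it overlaps heavily with a condition already included.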

Again, one-half of the total variation in 
the Insight Scale was accounted for by the 
obtained multiple prediction equation. How- 
ever, Concreteness or Specificity of the 
group discussion alone accounted for a sig- 
nificant source of variance. That is, the 
other conditions which were found to be 
associated with the Insight Scale do not 
account for any additional variation in 
insight beyond that accounted for by the 
hypothesized therapeutic condition of Con- 
creteness of group discussion. 

A similar analysis using the Personal 
Reference Scale yielded results indicating 
that approximately one-third of the total 
variation in the rate of personal pronoun 
emission by the patients could be accounted 
for by only two separate sources of vari- 
ance: Concreteness or Specificity of the 
group discussion, and Genuineness or Self- 
Congruence of the group members in the 
therapy relationship. 

The necessity of the three therapist condi- 
tions specified by the current formulation 
of client centered theory was evaluated by 
means of analysis of variance of multiple 
regression using therapist Accurate Em- 
pathy, Unconditional Positive Regard, and 
Self-Congruence or Genuineness as the con- 
comitant variables, and alternatively, the 
Process Scale and the Insight Scale as the 
dependent variables. In both analyses the 
results indicated that only two, rather than 
three, of the conditions account for separate 
sources of variance: Accurate Empathy and 
Self-Congruence or Genuineness.

The results were interpreted and dis- 
cussed in terms of theoretical orientations 
to group psychotherapy with particular 
emphasis upon client centered theory. 


APPENDIX


SELECTED SAMPLES AND OBTAINED SCALE VALUES 


Number 4 

Patient 1: [Speaking rapidly] This is prob- 
ably of no interest to the rest of you, but is 
that the one who sent ... card about... “you
tranquilize yours, I'll tranquilize mine?”

P+: I don't know—I didn't notice who sent it. 

Pı: It's up on the bulletin board. 

Therapist: . . . I missed this... Whuh ...
would you state that again? 

Pı: Well... on the bulletin board at 3-West
. . . which you people don't know about . . . is
a card from Joan...

Р.: [Interjecting] That's . . . who— 

Р\: It says “you tranquilize yours, I'll tran-
quilize mine . . . inside are a bunch of drinks,
oh, they’re martinis, I think... 

P:: Mmhmm. 

Pı: And [she laughs] the characters . . . 
it's all in... comic... form... I'm joking
. . . it says "you tranquilize yours, I'll tran-
quilize mine." [Laughter]

P:: So, that's what... 

Ps: [Breaking in] Well, I think we all feel
. . . somewhat resentful . . . uh—when we
know that somebody goes out, and we feel that 
they're not really ready to step outside in... 
and live the way that they're, people are sup- 
posed to live, and we can't go, we know—we 
know sometimes that we're not really ready to 
go right then either . . . when we just can't 
understand how some people can go out, and 
all they're talking about before they ever get 
out is what a kind of a wild time they're gonna 
have as soon as they leave this place. 

T: Mmhmm. 

P:: [Speaking rapidly in short phrases] 
This is changing the subject, but is it...I
know it... shows my stupidity . . . but is it
really true that . . . some tranquilizers, some
drugs ... would make a person . . . react . . .
to the contrary and would make them more
upset— 

T: [Interrupting] We were talking about 
that— 

Ps: —at different times? 

T: [He continues, ignoring the interruption] 
—a, a little bit, Helen, I—I don’t remember if
you—uh— 

Pı: Say, if you took— 

T: [Ending]—When you first came in— 

Pı: [Continuing]—something when you 
were... 


P:; Mhmm. 

Pı: [Continuing]—carrying a baby and then 
you took something afterwards? 

P:: Well, Dr. Jones tried a Serpasil product 
and tranquilizer with me and the two of them 
reacted . . . favorably . . . but, when I took the 
Serpasil alone . . . it felt just like my breathing 
was being closed off, I couldn't breathe. 

Pı: Well, what— 

P:: What it does, I don’t know, but it does 
react on everybody differently. 


Number 16 

T: Mary is trying to tell us something here.

Pı: Yes, Mary is telling us about— 

T: [Interrupting] And other people are try-
ing to tell Annette something . . . and our
channels of communication are a little bit 
clouded. Barbara, you know what it is. 

P:: How did we get on the subject? 
[Laughter] 

P:: You see other people's faults . . . 


P:: [Interrupting] Introverts and extro-
verts ...

Ps: . . . as faults of other people, right?


P:: Well I assure you I never talked like
Melinda Jones did. I can [nervous laughter] 
swear to that on a stack of Bibles. 

T: Do we have something of the—uh . . .
Mary is saying, perhaps . . . that she wishes 
. .. she weren't talking so much either . . . 
right now, but that this group is kind of dull, 
. .. And you've got to talk, because you all 
just sit around and never do anything—you 
just smoke— 

P:: No, no, no, you are wrong on that. I am
not. The only,—I try to keep quiet. The only 
time I say something is when I felt I could 
make a contribution. Otherwise, I have tried 
to be as quiet as I could be. I wanted to come 
and listen, but Betty has been unusually quiet 
for I don't know how many classes.

T: And has been unusually hostile towards 
you, today. 

P:: Uhuh, mhm. [Laughter] 

T: [Pause] Now, Nancy has been rather 
hostile to you. 

Ps: Yes, [laughing nervously] rather openly. 

P:: [Calmly] And it doesn't bother me at all. 

Ps: [Surprised] It doesn’t? 

P:: No, it doesn't. 
Ps: Now, I can't believe that.

Ps: [Calmly] No, it does not. I have felt too 
much hostility in my life to be annoyed by it 
any more. 

Т: What were you going to say, Pat? 

Ps: Nothing. [Pause] I feel small, maybe. 

Р: I wish I had a good laugh. 

Pi: I feel small because I thought of some- 
thing that was brought up. It's [pause] oh, 
... I don’t know why ... some of us hang
on to our sickness and everything. [Pause] It
should be comparatively . . . easy to . . . pull 
yourself out . . . but it isn't. 

P; [In a calm lecturing voice] It certainly 
isn't. As the saying goes, “You can say that
again.” And in a way, Dr. X, I feel that I
have had so much more experience with this 
sort of thing, because I have been in so many 
sanitariums and gone through it so often. Not 
exactly the same thing, ‘cause it varies, it's 
just one thing this time, and one thing another. 
But wherever I have been, I have gotten well 
... by becoming interested . . . in other people, 
and benefiting . . . from their interest to me. 
The very first time I was hospitalized, when
I was 25 . . . and I met a bishop's daughter,
... She must have been at least twice my age,
and I learned a lot from her.




Number 18

T: There's so much . . . hate.

Ps: [Pause. Tearfully] I've never hated
people in my life, not even . . . [patient stops,
choked up, unable to proceed momentarily]
... the only one that I’ve ever bothered to have
the emotions—[Again pauses; group and ther-
apist seem to wait for patient to collect self]
I've learned one thing since I've been here—
that is, that my relationships with people...
haven't been . . . what they . . . what other
people seem to have, with, with interaction with
people, I’ve been kind of a “Touch-me-not”
sort of a person. [Pause] And . . . so . . .
and I don't know how to handle it . . .

T: Mhm [Pause. Therapist and group wait]
[Softly] Sort of like uh, “isolating oneself
from other people.”

Pı: [With deep feeling, sobbing] You're left
awfully lonely, you learn to hate your own
company . . . I just don't seem able to have
any control over it!

T: [Gently] How do you establish control
over something like this? [Long pause] Would
you like to hear how other people, the ideas
other people have, how they control this, how
it ought to be controlled?


Number 48 

Pi: That is another thing uh that made me 
uh, highly provoked so to speak, as far as I am 
concerned. I came out of service at 180 pounds— 
182 pounds. 

P:: What's that got to do with it? 

Pı: I could have, but I didn't want to, but 
I wasn't afraid of the guys in Janesville as far 
as, well, being popular. And I wasn't out to 
hurt the name of City of Janesville or anything 
like that. I just wanted uh, all the new ex-
periences I could think of.

Р: Tried to do everything you wanted huh? 
Well . . . I did the same thing. 

Pı: I wanted, uh, uh, to, to do everything 
on my own. I couldn't, I couldn't get along 
with the teachers. I, . . . wouldn't accept their 
instruction. I thought I was better than the 
teachers. [Therapist—Mhm] I admit it now, 
and uh before I couldn't. I didn't want to. 
I still thought I was... uh... better than 
anyone else, [Pause] [P:—I could see] Pos- 
sibly I couldn't be anything else—let's put it 
that way. I mean highly no good so to speak. 

P:; Well, you're not very old, what the 
heck, you could take off and— 

Pı: [Interrupting] I worry about that. I 
think if I was 60 years old, I would probably 
still . . . have [another patient sighs deeply]
the same general type of thinking. 

T: [Tentatively] I’m not quite sure . . . 

Pı: That I’m a better man than my father. 

P:: What's your father got to do with this? 

Pı: Nothing, no more. 

P. That's as it should be, my father was 
an alcoholic, and I made up my mind when 
I was a—even when I started hearing things 
about it— 

Т: [Interrupting] It seems that for John 
though, . . . his father does . . . become very 
important in his thinking. 

Pı: After all I know about him. Not that 
I'm right but I think I’m right.

P.: You still love your father, right? 

Pı: [Angry impatience] Yah, yah, yah, yeh. 
But that is as far as it goes, huh, I mean yah. 

P:: Yeh but you should still love him. 

Т: But... he still hates him. 

Pa: You shouldn't hate anybody, though 
[Therapist—I guess—] You should respect
him. 

Pı: I'm very sorry for having hated him—
for having to admit I hated him, [pause] be-
cause . . . oh . . . I don't know . . . maybe I
read too many books. Dr. — —— [author of 
book] said you hate somebody in anger your
blood goes up you are ready for a fight, and
things like that. It tears down your resistance 
internally and externally, and . . . your ability 
to get along with other people. 

Р»: My whole life changed, doc. I hadn't 
seen my father—I hadn't seen what he looked 
like, and I hated him. Like Al said, and 
when I did see him, I, I got to like him! And 
what are you going to do? [Pause] [Strained 
voice] The guy is an alcoholic, and all the rela- 
tions think he is no good, but yet, and yet 
you still respect the guy. [Therapist —Mhm. 
'cause] He has had a hard life and— 

Pi: [Angrily] You think I want everybody 
to feel sorry for me. 

Pı: No! 

Ps: Huhuh. 

Р\: [Angrily] Who wants a guy who comes 
from the slums of Janesville, you know. 

Р»: No. 

Ps: That's just your own thinking. 

Pı: Put him in here, and don't let him do 
as he wants. I was in and out. 

Ps: That’s not true. 

Pı: [Very agitated] Oh, he's out of the 
service and hasn't changed and, and all feel 
sorry for him, all the "goody" talk and you 
get a group so sympathetic for a guy, and all 
of a sudden he throws the whole community 
out of line. How do you think I feel? 

T: Feeling it was you that was throwing 
the whole community out of whack? 

Pi: Nah!

Ps: Well, how about John, the fellow that 
just come in here. He told me he was from 
Janesville, and he had a loaded .22 rifle and 
a pistol. And he actually threatened K. F. 
Jones, the principal of the high school back 
home. I thought I would knock him on his 
butt, but at the same time [Ps—You could 
understand it] I didn't want to. Not that I 
felt any better, . . . [Therapist—Mhm] and 
I am not any better than K. F. Jones but as 
a human being I wanted to be treated as one. 
[Pause] I mean, he's good or he wouldn't be 
in the position he's in. 

T: And yet he didn't really treat [P:—Yeh, 
didn't care] you as you hoped. 


Number 89 

T: It was almost as though you were re- 
sponsible for her heart attack. 

Pı: Well, yes. 

T: Because you... 

Pı: She had a heart attack like I've got a
sore foot. [Scornfully] But the point is, Dr. 
Jones, at times I hate her, I hate her so...
I can't. . . dare think about it but what I
get sick, [shouting] and yet she has many
good points, and my husband loves her, and 
he . . . she’s his mother. [Tenderly] We go 
there occasionally, not any more than I have 
to, but I’m friendly with her and she's very 
helpful if I am ill and we need, . . . something 
sent over in the line of food she's apt to 
send it, but uh... another thing, I hate this
constant “Poor James. He's had so much to
put up with, with uh... Helen... with this
nervous condition she has. It's sad." [Mock- 
ingly] 

T: They take rather a . . . something of a
patronizing attitude toward you? You hate her 
and yet the things that she does are not such 
that you can really come out and express it 
because she presents it as though she's really 
helping you. [Haltingly]

Pı: Oh, several times I went to him and 
I said “We're going to take him, now that's 
all" I said, “he’s our . . . some of this is 
ridiculous," and even though I do have two
other children, I... we want him. He's ours. 
And then maybe something would come up 
where . . . I wasn't able physically to take 
on that. Then we'd come to the point where 
we needed help. Well, naturally, they would 
offer to take David . . . [Long pause] I've 
often wondered if I blame myself because I 
didn't keep him. Maybe he... maybe I felt
that if I had had him, this wouldn't have 
happened, which is silly, because he, he had 
uh . . . virus pneumonia with measles and 
asthmatic condition which . . . [Voice fades] 
It wouldn't have made any difference. [De- 
jectedly] 

Т: But you feel guilty about that? 

Pı: I didn’t know I did. I... I... 
don't know if I do, but I know that it makes 
me very unhappy when I think about it. At 
the time of his death I thought I’d accepted it 
very well, and . . . I trained myself not to be 
overemotional about it or dramatize the situ-
ation, I... I... I hated that sort of thing
where she kept his shoes under his bed for 
two years after he was dead, had pictures taken 
of him in his casket. [Sadly] Morbid things. 
[Emphatically]

P:: Why do people do things like that? 

Pı: I don't know. [Yelling]

P:: Take pictures of people in their caskets! 

Pi: I think that's the most horrible thing.
And she gave me one of these enlarged . . .
pictures of my little boy, and I don't want 
I want nothing to do with it! [Rapidly] 


TABLE A1

SCALE VALUES ASSIGNED TO SELECTED SAMPLES

Conditions                          Sample  Sample  Sample  Sample  Sample
                                     No. 4  No. 16  No. 18  No. 48  No. 89

Criteria:
  Process Scale                         37      53      86      64      76
  Insight Scale                          2       5       5       5       6
  Personal Reference Scale              44      98      86      99     112

Therapist Conditions:
  Empathic Understanding                 3       4       9       7       4
  Accurate Empathy                       1       2       8       8       8
  Genuineness or Self-Congruence         6       7       9       6       6
  Unconditional Positive Regard          4       1       8       7       8
  Leadership                             3       4       7       3       6
  Responsivity                           3       6       2       9       4
  Assumed Similarity                     3       3       7       5       5

Group Conditions Under Control of Therapist:
  Concreteness or Specificity            3       3       6       7       8
  De-individuation                       7       4       4       3       4
  Empathic Understanding                 3       4       4       7       5
  Unconditional Positive Regard          5       4       5       4       5
  Cooperative Spirit                     3       3       3       5       3
  Sociability                            5       5       5       3       4

Group Conditions Indirectly Influenced by the Therapist:
  Genuineness or Self-Congruence         4       ?       1       6       1
  Cohesiveness                           5       5       6       7       ?
  Ego Involvement                        3       6       8       8       ?


Some people are very easily hypnotized,
while others are extremely resistant. 
This common observation suggests that a 
study of the personality correlates of sus- 
ceptibility might prove of interest not only 
in explaining the nature of the hypnotic 
trance, but in telling something about the 
organization of personality, about self- 
control and persuasibility. 
Before attempting a study of the person- 
ality correlates we need to know whether or 
not we are dealing with a stable and meas-
urable trait of hypnotic susceptibility. If on
the one hand susceptibility fluctuates uncer-
tainly from day to day, from hypnotist to
hypnotist, from induction method to induc-
tion method, it would be futile to look for
stable personality characteristics lying be-
hind whimsical responsiveness. If on the
other hand susceptibility turns out to be a
dependable characteristic, then a search for 
its conditions will be promising. The in- 
vestigation reported here is concerned with 
susceptibility as such. While personality 
manifestations were also studied, they are 
not being reported at this time. In one sense 
we are studying the criterion (hypnotic sus- 
ceptibility) to be used in the later analysis 
of predictions to be made from variables of 
nonhypnotic sort. In another sense, how- 
ever, this is an aptitude test, predicting how 
a person will respond in further work with 
hypnosis. 
Variability is found in all behavior. 
Hence in hoping that hypnotic susceptibility
may be relatively stable, we are not expect-
ing it to be rigidly unchangeable. If there
is a core of stability, we would still hope to
show some circumstances under which sus-
ceptibility will change, for example, with
practice, with changes in motivation, with
rapport better established between hypnotist
and subject, possibly under the influence of
drugs.

This is a report of one of a series of investiga-
tions conducted within the Laboratory of Human
Development under a grant from the Ford Foun-
dation to Robert R. Sears and Ernest R. Hilgard.

What we shall mean by hypnotic suscep- 
tibility for the purposes of our investigations 
is a relatively persistent tendency to yield 
the phenomena historically recognized as be- 
longing to the hypnotic trance, when the 
opportunity to yield these phenomena is 
given under standard conditions. We em- 
phasize the importance of standard proce- 
dures at this stage, for without them we 
cannot hope to demonstrate such lawfulness 
as may exist. These conditions will not be 
optimal for all subjects, but later investiga-
tions can tell whether or not the conditions
we have chosen are representative ones. By 
using such a procedure we can assign some 
sort of numerical value that will state the 
degree of susceptibility of one person rela- 
tive to another under these standard condi- 
tions. By repeating the procedures on more 
than one day, with alternate forms, the sta- 
bility of the measurement can be determined 
in the form of a reliability measure. 

The importance of such measures for 
carrying on the study of personality corre-
lates is evident. It is futile to attempt pre-
dictions if there is nothing measurable and
stable to predict. But there are many other 
purposes for such measures. Now that hypnosis
is being used widely in medical practice,
as in dentistry, obstetrics, surgery,
and psychotherapy, it is desirable to know
whether or not the hypnotic method is ap- 
propriate for a given subject. We do not 
now know, for example, how much sus- 
ceptibility is desirable for the hypnotic 
method to be efficacious for these various 
purposes. It is quite possible, for example, 
that a very slight amount of susceptibility is 
all that is necessary for some psychothera- 
peutic uses of hypnosis. Having a standard- 
ized scale will help to make our knowledge 
more precise. It may also turn out that with 
the high motivation of a person seeking 
relief from pain, hypnotic susceptibility will 
increase. If this should be the case, it will 
give us important information about the 
nature of susceptibility. All of this needed 
information depends upon sound measures 
of susceptibility. 

If our scores correlate with such meas- 
ures as hypnotic amnesia, hallucination, and 
the carrying out of posthypnotic sugges- 
tions, we shall assume that they are sampling 
susceptibility to hypnosis. Low scores on 
our scale will be made by people who give 
some of the responses commonly used in 
tests of waking suggestion, while giving 
few, if any, of the responses more particu- 
larly associated with the recognized hyp- 
notic trance; high scores will be made by 
those who give more of the typical trance 
phenomena. To serve as a predictive scale 
our scores need only indicate that the sub- 
jects with high scores are good candidates 
for hypnosis; for this purpose they need 
say nothing about how deeply the subjects 
have already been hypnotized. 

A susceptibility scale can be validated by 
bringing back subjects who score high and 
low and then attempting to elicit from them 
other kinds of trance phenomena. Some 
experiments along these lines will be re- 
ported indicating that the scale appears to 
be valid as well as reliable. 

Before presenting the data from our in- 
vestigation we shall review some of the his- 
tory of the problem of hypnotic suscepti- 
bility, along with the previous attempts to 
measure individual differences in susceptibility.


PRIOR INVESTIGATIONS OF SUSCEPTIBILITY 


The conception that hypnosis occurs in 
degrees has been forced upon all investiga- 
tors, even though their definitions of hyp- 
nosis, or their preferences regarding it, may 
have led them to wish for all-or-nothing 
manifestations. Braid (1843) characterized 
the true state of hypnotic sleep according to 
complete spontaneous amnesia for all events 
occurring during the trance, but he was 
troubled by finding many of his patients 
helped by his procedures even though they 
failed to meet his criterion. In the end, he
stood by his guns, saying that his patients 
were in other states, but not truly hypnotized 
unless completely amnesic. Charcot (1882) 
and his co-workers Richer (1885) and 
Gilles de la Tourette (1889) specified three 
kinds of hypnotic state (catalepsy, lethargy, 
and somnambulism). These were thought 
of as discrete, with sharp transitions be- 
tween them. It was an easy further step, 
however, for writers such as Pitres (1891), 
influenced by Charcot, to add other border- 
line, mixed, and incomplete states (états 
frustes). The implication is still that of a 
mixture of states, rather than a true con- 
tinuum, but once there are enough border- 
line conditions there is little distinction be- 
tween a mixture and a continuum. 


Nineteenth Century Depth Scales 


The analogy with sleep makes the notion 
of degrees of hypnosis, expressed as degrees 
of depth, a plausible one. This manner of 
thinking seems to have been first proposed 
by Richet (1884). He recognized that the 
induced somnambulism of the mesmerists 
was the same as that produced by other 
methods, and he rejected the animal mag- 
netism explanation. Three degrees that he 
recognized were: (a) torpor, in which the 
eyes close spontaneously and can be opened 
with great difficulty, if at all; (b) excitation,
with total inability to open the eyes,
unresponsiveness except to the hypnotist,
some “automatism” and “double-consciousness”;
(c) stupor, with previous phenomena
in greater degree, spontaneity totally
lacking, subject a complete automaton,
easily produced “contractures” and “catalepsies.”
There is usually amnesia in the
second stage, more complete amnesia in the 
third. Here we have the beginning of a 
depth scale. 

Not long afterwards, Liébeault (1889) 
proposed a six-point scale, and Bernheim 
(1891) a nine-point scale. These are sum- 
marized in Tables 1 and 2. Liébeault felt 
that his scale was unidimensional in the 
sense used much later in Guttman-type 
scales, that is, that an individual who 
showed the symptom characteristic of one 
of his degrees of depth would always show 
all the symptoms of lesser degree. Both
scales emphasize spontaneous amnesia as a
characteristic of the deeper stages. This is
equally true for Bernheim, despite his theoretical
position that all phenomena of hypnosis
are the result of suggestion. Perhaps
he interpreted amnesia as a natural accompaniment
of other suggestions, although it
was not itself suggested.


TABLE 1

DEPTH OF HYPNOSIS ACCORDING TO LIÉBEAULT (1889)

Light sleep
  1. Drowsiness.                 Torpor, drowsiness, heaviness of the head,
                                 difficulty in opening the eyes.
  2. Light sleep.                Above signs plus catalepsy, but with ability
                                 to modify the position of members if
                                 challenged.
  3. Light sleep: deeper.        Numbness, catalepsy, automatism. The subject
                                 is no longer able to interfere with rotary
                                 automatism.a
  4. Light sleep: intermediate.  In addition to catalepsy and rotary
                                 automatism, the subject can no longer attend
                                 to anything else but the hypnotist and has
                                 memory only for the interchange between them.

Deep or somnambulistic sleep
  5. Ordinary somnambulistic     Total amnesia on waking. Can have
     sleep.                      hallucinations during sleep. Hallucinations
                                 vanish with waking. Subject submits to the
                                 will of the hypnotist.
  6. Profound somnambulistic     Total amnesia on waking. Hypnotic and
     sleep.                      posthypnotic hallucinations possible.
                                 Complete submission to the hypnotist.

a Catalepsy refers to waxy flexibility, in which the arms remain where they
are placed. Rotary automatism refers to persistence of rotary movement of the
hand and forearm, once set into motion by the hypnotist.


TABLE 2

DEPTH OF HYPNOSIS ACCORDING TO BERNHEIM (1891)

Memory retained on waking
  Degree 1.  Torpor, drowsiness, or various suggested sensations such as
             warmth, numbness.
  Degree 2.  Inability to open the eyes if challenged to do so.
  Degree 3.  Catalepsy suggested by the hypnotist and bound up with the
             passive condition of the subject, but may be counteracted by
             the subject.
  Degree 4.  Catalepsy and rotary automatism which cannot be counteracted
             by the subject.
  Degree 5.  Involuntary contractures and analgesia as suggested by the
             hypnotist.
  Degree 6.  Automatic obedience; subject behaves like an automaton.

Amnesia on waking
  Degree 7.  Amnesia on waking. No hallucinations.
  Degree 8.  Able to experience hallucinations during sleep.
  Degree 9.  Able to experience hallucinations during sleep and
             posthypnotically.
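
Liébeault's belief that his scale was unidimensional, in the sense later formalized in Guttman-type scales, amounts to a checkable property of a subject's record: no deeper degree should be shown once a lighter one is missing. The sketch below illustrates that check under the assumption that each record is a list of true/false values ordered from the lightest degree to the deepest; the function names are hypothetical and the records are invented.

```python
def is_guttman_consistent(shown):
    """True if every degree that is shown is preceded only by shown degrees.

    `shown` is a list of booleans ordered from the lightest degree of
    depth to the deepest.
    """
    seen_missing = False
    for degree_shown in shown:
        if degree_shown and seen_missing:
            return False  # a deeper degree appears after a lighter one is absent
        if not degree_shown:
            seen_missing = True
    return True


def depth_reached(shown):
    """Number of the deepest degree shown (0 if none is shown)."""
    depth = 0
    for degree, degree_shown in enumerate(shown, start=1):
        if degree_shown:
            depth = degree
    return depth


# Invented records on a six-degree scale.
cumulative_record = [True, True, True, False, False, False]
irregular_record = [True, False, True, False, False, False]

print(is_guttman_consistent(cumulative_record), depth_reached(cumulative_record))
print(is_guttman_consistent(irregular_record), depth_reached(irregular_record))
```

Only for records of the first kind does the single number returned by `depth_reached` fully describe the subject's performance.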

With the appearance of such scales it be- 
came meaningful to speak of the distribu- 
tion of susceptibility according to the depth 
of hypnosis that could be reached. It is pos- 
sible to make a distinction between suscepti- 
bility and depth, but as a beginning it was 
natural to define susceptibility by the great- 
est depth that the subject was able to attain. 

Many of those who worked with hypnosis 
were satisfied with simpler classifications of 
hypnotic states. A common three-point scale 
distinguished between “somnolence,” “light
sleep” or “hypotaxy,” and “deep sleep” or
“somnambulism.” This classification was 
used by Forel, Loewenfeld, Fontan, Ségard, 
and Ringier. Others preferred a twofold 
classification, whereby individuals fell in
Group I if only their motor behavior was
affected, and into Group II if in addition 
they yielded also perceptual and ideational 
changes. Gurney, Delboeuf, Hirschlaft, and 
Dessoir preferred this scheme. Mayo 
(1852) made a distinction between "wak- 
ing" and "sleep" behavior. 

These nineteenth century scales have 
enough in common that it is possible to 
make some comparisons among the findings 
of the various authorities. All, for example,
give a good deal of weight to spontaneous 
(“nonsuggested”) posthypnotic amnesia as 
a criterion of deep hypnosis. Other stages 
are usually described according to classes of 
events, rather than according to specific 
tests, so that there is an element of uncer- 
tainty about borderline states. Induction 
procedures were not standardized, except 
within master-disciple groups, and there 
was always a certain amount of accepted 
folklore. For example, it was assumed by 
many hypnotists that hallucinations were 
produced by a simple posturing of the sub- 
ject, without verbal suggestions of hallu- 
cination. The word "suggestion" to one 
hypnotist might mean a verbal command, 


while to another it might mean a nonverbal 
suggestion produced by some sort of manip- 
ulation. Thus in comparing the distributions 
of susceptibility as reported by these early 
writers, one naturally must recognize large 
elements of uncertainty in making quantita- 
tive comparisons. 

It comes as something of a surprise to
find the very large numbers of subjects for 
whom records were kept and reported in the 
latter part of the nineteenth century. In 
Table 3 we have digested the results from 
two major reviews (Loewenfeld, 1901; 
Schmidkunz, 1894) adding some cases re- 
ported a little later by Bramwell (in 1903). 
The 14 summarized distributions in this 
table are based on records from 19,534 
patients—a very substantial number, even 
with allowance for some duplications in the 
reports. There are included only those re- 
ports which permitted classification (always 
with a margin of uncertainty) into refrac- 
tory or nonsusceptible subjects and three 
degrees of susceptibility: drowsy-light, 
hypotaxy-moderate, and somnambulism-deep.
Because the conditions of each investigation
differ, we have reported the
means of the investigations without respect
to the variation in numbers of cases, thus
using each report as one case in computing
the means at the bottom of Table 3.


TABLE 3

DISTRIBUTION OF SUSCEPTIBILITY TO HYPNOSIS: NINETEENTH CENTURY STUDIES

                                                                                    Distribution of susceptibility (in percent)
Investigator               Source  Date        Range; sessions (N)        Cases (N)  Refractory:     Drowsy-  Hypotaxy-  Somnambulistic-  Total
                                                                                     nonsusceptible  light    moderate   deep             susceptible
Peronnet                   a       ante-1900                                    467       25           10        20          45              75
Forel                      a       ante-1898                                    275       17           23        37          23              83
Lloyd-Tuckey               a       ante-1900                                    220       14           49        28           9              86
Bramwell                   b       ante-1900   4-76; M = 23                     200       11           24        26          39              89
Von Schrenck-Notzing       a       ante-1900                                    240       12           17        42          29              88
Mosing                     c       1889-93     a few children; M = 20-30        594       12           42        17          29              88
Hilger                     a       ante-1900                                    351        6           20        42          32              94
Von Schrenck-Notzing       a       1892                                       8,705        6           29        50          15              94
  (pooling of 15 reports)
Liébeault                  a       1884-89     7-63                           2,654        5           22        62          11              95
v. Eeden & v. Renterghem   a       1887-93                                    1,089        5           43        41          11              95
v. Renterghem              a       ante-1900   M                                414        4           52        33          11              96
Wetterstrand               a       1890        Failures, or trials            3,209        3           36        48          13              97
Velander                   a       ante-1900                                  1,000        2           32        54          12              98
Vogt                       a       ante-1900                                    116        0            2        13          85             100

Total cases                                                                  19,534
Range of percentages                                                                      0-25         2-52      13-62       9-85          75-100
Mean of percentages                                                                        9           29        36          26              91

a Loewenfeld (1901).
b Bramwell (1903).
c Schmidkunz (1894).
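
The choice to treat each report as one case matters because the reports differ greatly in size. The sketch below contrasts the unweighted mean of percentages with a mean weighted by the number of cases, using the refractory column as read from Table 3. The weighted figure is an illustration only and is not reported in the source; the variable and function names are hypothetical.

```python
# (investigator, cases, percent refractory) as read from Table 3
reports = [
    ("Peronnet", 467, 25),
    ("Forel", 275, 17),
    ("Lloyd-Tuckey", 220, 14),
    ("Bramwell", 200, 11),
    ("Von Schrenck-Notzing", 240, 12),
    ("Mosing", 594, 12),
    ("Hilger", 351, 6),
    ("Von Schrenck-Notzing (pooled)", 8705, 6),
    ("Liebeault", 2654, 5),
    ("v. Eeden & v. Renterghem", 1089, 5),
    ("v. Renterghem", 414, 4),
    ("Wetterstrand", 3209, 3),
    ("Velander", 1000, 2),
    ("Vogt", 116, 0),
]


def mean_of_percentages(rows):
    """Each report counts once, whatever its number of cases."""
    return sum(pct for _, _, pct in rows) / len(rows)


def case_weighted_percentage(rows):
    """Each report counts in proportion to its number of cases."""
    total_cases = sum(n for _, n, _ in rows)
    return sum(n * pct for _, n, pct in rows) / total_cases


total_cases = sum(n for _, n, _ in reports)
unweighted = mean_of_percentages(reports)
weighted = case_weighted_percentage(reports)
print(total_cases, round(unweighted, 1), round(weighted, 1))
```

The unweighted mean gives small and large reports equal voice, which suits the authors' point that the conditions of each investigation differ; a case-weighted mean would instead be dominated by the few very large series.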


More Recent Scales of Hypnotic 
Susceptibility 


A new interest in hypnotic susceptibility 
came to the fore around 1930, when M. M. 
White (1930) published his scale. He made 
use of specific responses to suggestions 
given in hypnosis as a means of arriving at 
scores, and thus began a practice adopted 
by most of the later scales. Shortly there- 
after the Davis and Husband (1931) scale 
appeared, which, while more detailed and 
covering a wider range of depth, assigns 
scores on the basis of responses to classes 
of suggestions rather than to specific re- 
sponses. At about the same time Barry, 
MacKinnon, and Murray (1931) proposed 


a scale based on a short list of specific sug- 
gestions. They placed much weight upon 
the subject's ability to have some suggested 
posthypnotic amnesia, and upon suggested 
inhibition of response, that is, loss of ability 
to control certain types of movement, such 
as separating interlocked fingers. While 
Hull (1933) did not develop a scale of sus- 
ceptibility, he often used speed of eye clos- 
ure to suggestion as a measure of suscepti- 
bility. The well-known scale of Friedlander 
and Sarbin (1938) combines this emphasis 
upon eye closure with the kinds of items 
used in the Barry, MacKinnon, and Murray 
scale. The scale developed by Eysenck and 
Furneaux (1945) is similar in many re- 
spects to that of Friedlander and Sarbin, 
while the scales of LeCron and Bordeaux 
(1947) and of Watkins (1949) are more 
nearly variations of the Davis-Husband type 
of scale. Of these, the scale by LeCron and
Bordeaux is of great length, and covers a
very large variety of hypnotic phenomena.


TABLE 4

DISTRIBUTION OF SUSCEPTIBILITY TO HYPNOSIS: MORE RECENT STUDIES

                                                                            Distribution of susceptibility (in percent)
Investigator                  Date  Subjects           Sessions  Cases  Refractory:     Drowsy-  Hypotaxy-  Somnambulism-  Total
                                                       (N)       (N)    nonsusceptible  light    moderate   deep           susceptible
Eysenck & Furneaux            1945  Neurotic patients     1        60       37a           38        17          8             63
Friedlander & Sarbin          1938  College students      1        57       33b           25        37          5             67
Weitzenhoffer                 1956  College students      1       200       23b           59        15          3             77
Barry, MacKinnon, & Murray    1931  College students      1        73       16c           37        29         18             84
Hilgard, Weitzenhoffer,
  & Gough                     1958  College students      1        74        3b           51        30         16             97

Total                                                             464
Range of percentages                                                     [illegible]
Mean of percentages                                                      [illegible]

a Nonsusceptible include lowest category reported, scores 0-10 on an 80-point
scale; others categorized according to thirds of remaining scale (i.e., 10-80).
b Converted according to Friedlander-Sarbin practice, with nonsusceptible
scoring 0 on Friedlander-Sarbin scale, light = 1-9, medium = 10-14, deep = 15-20.
c Converted according to most comparable categories:

The most widely used of these scales have 
been those of Davis and Husband and of 
Friedlander and Sarbin. Because descrip- 
tions of these scales are readily available 
elsewhere (e.g., Weitzenhoffer, 1957) they 
will not be repeated here. 

Results of investigations using the more 
modern scales are presented in Table 4. For 
this purpose, responses have been reclassi- 
fied in the categories of Table 3, with every 
effort to be as fair as possible to the new 
and old conceptions. Comparison of 
Tables 3 and 4 shows that the ranges within 
each category of depth overlap substan- 
tially. Recent investigators report a higher 
mean percentage of refractory subjects and 
a lower mean percentage of somnambulistic 
subjects, but the orders of magnitude are 
similar. The older investigators were con- 
cerned primarily with clinic patients, for 
whom the motivation for successful hyp- 
nosis was very high, while the modern in- 
vestigators (with the exception of Eysenck 
and Furneaux) used college students. There 
were some children, also, in the older sam- 
ples, and as we shall see, they are more sus- 
ceptible than adults. Also, of course, the 
older studies often used many sessions. In 
any case results are sufficiently variable, 
even among the three studies using a mod- 
ern scale such as the Friedlander-Sarbin 
one, that further studies are clearly in order. 
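
The comparison of means drawn above can be retraced from the study rows of Table 4. The sketch below is an illustration under one stated assumption: the refractory percentage for Eysenck and Furneaux is taken as 100 minus their total susceptible (63), since the four categories sum to 100 in every other row. The dictionary and function names are hypothetical.

```python
# Percentages per study, in the order
# (refractory, drowsy-light, hypotaxy-moderate, somnambulism-deep),
# as read from Table 4.
modern_studies = {
    "Eysenck & Furneaux (1945)": (37, 38, 17, 8),   # 37 assumed as 100 - 63
    "Friedlander & Sarbin (1938)": (33, 25, 37, 5),
    "Weitzenhoffer (1956)": (23, 59, 15, 3),
    "Barry, MacKinnon, & Murray (1931)": (16, 37, 29, 18),
    "Hilgard, Weitzenhoffer, & Gough (1958)": (3, 51, 30, 16),
}

# Means of percentages printed at the bottom of Table 3.
older_means = (9, 29, 36, 26)


def column_means(rows):
    """Unweighted mean of each category, one study counting as one case."""
    rows = list(rows)
    return tuple(sum(column) / len(rows) for column in zip(*rows))


modern_means = column_means(modern_studies.values())
labels = ("refractory", "drowsy-light", "hypotaxy-moderate", "somnambulism-deep")
for label, older, modern in zip(labels, older_means, modern_means):
    print(f"{label:18s} older {older:5.1f}   modern {modern:5.1f}")
```

Under that assumption the computed means agree with the statement in the text: the modern studies show more refractory and fewer somnambulistic subjects than the older ones.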


Form of Distribution 


The average results, when classified ac- 
cording to the nineteenth century categories, 
suggest a fairly normal distribution of hyp- 
notic susceptibility, with a few refractory 
cases, a few very good subjects, and the rest 
falling between. When, however, results are 
scaled according to standardized measure- 
ments, various forms of distribution are re- 
ported. Perhaps the most commonly re- 
ported distribution is that of an inverted J, 
with most subjects relatively little suscepti- 
ble, and a pronounced skew of the distribu- 
tion toward the more susceptible end of the 
scale (e.g., Eysenck & Furneaux, 1945;
Friedlander & Sarbin, 1938; Hilgard, 


Weitzenhoffer, & Gough, 1958; Weitzen- 
hoffer, 1956). With a somewhat different 
scaling of scores, however, a bimodal distri- 
bution may result (Hilgard et al., 1958). 
There are traces of bimodality in a number 
of previously published investigations (e.g., 
Barry et al., 1931; Davis & Husband,
1931). The uncertainty about the form of 
distribution raises a number of questions, to 
which we shall return. 


Age and Susceptibility 


There is a good deal of incidental evi- 
dence that children make unusually good 
hypnotic subjects (e.g., Bramwell, 1956)
but careful investigations are lacking. The 
one best investigation was carried out by 
Liébeault (as reported by Beaunis, 1887). 
He studied 744 subjects ranging from early 
childhood to old age, with the results shown 
in Figure 1. Of his child subjects below the
age of 14, none was uninfluenced, compared
with about 10% uninfluenced at most other
ages. The highest proportion of somnambulistic
subjects was found between the
ages of 7 and 14 (55.3%). There is little
progressive change in susceptibility beyond
the age of 14. The conclusions of Ringier
(Schmidkunz, 1892) and of Mosing
(Schmidkunz, 1894) support Liébeault's
findings.

[Figure 1: bar chart, "Age and Susceptibility (Liébeault)". Rows: age
groups from below 7 to 63 and up, with number of cases (N = 744).
Horizontal axis: per cent of subjects, 0 to 100. Legend: somnambulism;
somnolence to very deep sleep; refractory (uninfluenced).]

Fig. 1. Age and susceptibility to hypnosis. Data
from Liébeault as reported by Beaunis, 1887.

While comparable data are not available 
from more recent studies, one of Hull’s students,
Ramona Messerschmidt, reported
data on postural sway for a sample of chil- 
dren between the ages of 5 and 16. Because 
postural sway shows a positive correlation 
with hypnotizability, these data are relevant. 
She found an increase in responsiveness 
from age 5 to high points at ages 6 and 8, 
with slow decline thereafter (Hull, 1933, 
p. 84). Her results are also coherent with 
those of Liébeault. 


Sex and Susceptibility 


Perhaps because of the early association 
of hypnosis with hysteria, and of hysteria 
with the female, there is a popular miscon- 
ception that women are more readily hyp- 
notized than men. The problem has been of 
continuing interest, and there have been 
numerous reports concerning sex differences.
Wetterstrand (Loewenfeld, 1901)
felt there were no sex differences, a conclu- 
sion with which Bramwell (1956) agreed. 
Liébeault reported his findings carefully, 
finding no sex differences. The occasional 
reports of differences favoring women have 
found differences too slight to meet tests of 
statistical significance (Davis & Husband, 
1931; Friedlander & Sarbin, 1938; Weitzen- 
hoffer & Weitzenhoffer, 1958). One excep- 
tion is the report of Hilgard et al. (1958) 
reporting a statistically significant differ- 
ence, with women more susceptible. 

Not only are mean sex differences seldom 
found, but the distributions are very similar
for men and women, even when circum- 
stances of experimentation and measure- 
ment produce distributions differing in gen- 
eral form from one study to another. 


Summary 


Differences in susceptibility to hypnosis 
have been found by all workers. Although 
the detailed findings are in disagreement, 
the nineteenth century studies, carried on 
with thousands of patients, are in general 
agreement with the more recent laboratory 
studies done largely with college students. 
The earlier studies tend to report somewhat 
higher average success, but many factors 


can account for this: the motivation of the 
subjects, the inclusion of children, repeated 
hypnotic sessions. The basic phenomena 
considered to be signs of the hypnotic trance 
are very much today what they were then. 
A particular thread of continuity is the use 
of posthypnotic amnesia as a sign of a sub- 
stantial trance. 

The early study of Liébeault on age dif- 
ference remains essentially sound, although 
further studies should be conducted. It ap- 
pears from what evidence is available that 
hypnotic susceptibility is at its height some- 
where between the ages of 7 and 14. 

Sex differences are not at all prominent, 
if indeed they can be demonstrated at all. 
There may be a slight tendency for women 
to be more susceptible than men, but if the 
tendency exists it is slight indeed. 

There are other problems of hypnotic 
susceptibility, such as the dimensions of 
hypnotizability. These we shall postpone 
for later consideration. 


STANFORD HYPNOTIC SUSCEPTIBILITY
SCALE


A long-range study of hypnotic suscepti- 
bility was begun at Stanford during the 
academic year 1957-58. A preliminary study 
of individual differences in susceptibility, 
based on the scores of 74 subjects (Hilgard
et al., 1958) pointed up the need for some 
revisions of the scale then used, a slight 
modification of the Friedlander-Sarbin scale 
(1938). A revision of the scoring weights 
led in that study to a marked bimodality of 
scores, a result that it was feared might 
have been due to the nature of the items in 
the scale. Also the very low scores (of 0 
and 1) were so frequent that it seemed 
desirable to add some easier items in order 
to spread out the score distribution. 

With this background a new susceptibility 
scale was prepared and after some pretest- 
ing in the summer of 1958 a standardization 
test was begun in the autumn of 1958. This 
report is based on the results of 124 sub- 
jects tested with the new scale. Because 
additional scales are in preparation, the 
initial scale is known as the Stanford Hypnotic
Susceptibility Scale, which is Part I of
the total Stanford Hypnotic Scales. Part II
will be a scale sampling a greater variety of
hypnotic phenomena, for subjects who
prove susceptible on Part I. The susceptibility
scale has been separately published
(Weitzenhoffer & Hilgard, 1959). The published
version gives two forms of the scale,
with complete directions, and some preliminary
standardization data. Hence the detailed
instructions will not be repeated here.


TABLE 5

ITEMS IN THE STANFORD HYPNOTIC SUSCEPTIBILITY SCALE

Item               Form A                   Form B                   Criterion of passing
Postural sway      Backwards                Backwards                Falls without forcing
Eye closure        Form A Induction         Form B Induction         Closes eyes without forcing
Hand lowering      Left                     Right                    Lowers at least 6 inches by end of
                                                                     10-second timed interval
Immobilization     Right arm                Left arm                 Arm rises less than 1 inch in
                                                                     10-second timed interval
Finger lock        Before chest             Overhead                 Incomplete separation of fingers
                                                                     at end of 10 seconds
Arm rigidity       Left arm                 Right arm                Less than 2 inches of arm bending
                                                                     in 10 seconds
Hands moving       Together                 Apart                    (A) Hands as close as 6 inches;
                                                                     (B) Hands at least 6 inches apart
Verbal inhibition  Name                     Hometown                 Name unspoken in 10 seconds
Hallucination      Fly                      Mosquito                 Any movement, grimacing,
                                                                     acknowledgment of effect
Eye catalepsy      Both eyes closed         Both eyes closed         Eyes remain closed at end of
                                                                     10 seconds
Posthypnotic       Changes chairs           Rises, stretches         Any partial movement response
                                                                     at posthypnotic signal
Amnesia test       Recall of items 3 to 11  Recall of items 3 to 11  Recall of 3 or fewer items

The 12 items that receive scores in the 
scale are summarized in Table 5. An addi- 
tional item (arm catalepsy after passive lift- 
ing of the arm) was tested but dropped 
from the final scale because of its failure to 
correlate with the other items. 
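
Since each of the 12 items in Table 5 is passed or failed, a session score is simply the count of items passed, giving 0 to 12 for one day and 0 to 24 for two. The sketch below illustrates this; the item keys, function names, and the two helper criteria are hypothetical renderings of Table 5, not the published scoring directions.

```python
ITEMS = [
    "postural_sway", "eye_closure", "hand_lowering", "immobilization",
    "finger_lock", "arm_rigidity", "hands_moving", "verbal_inhibition",
    "hallucination", "eye_catalepsy", "posthypnotic", "amnesia",
]


def hand_lowering_passed(inches_lowered):
    """Table 5: lowers at least 6 inches by the end of the timed interval."""
    return inches_lowered >= 6


def amnesia_passed(items_recalled):
    """Table 5: recall of 3 or fewer items."""
    return items_recalled <= 3


def score_session(passed):
    """One point per item passed; `passed` maps each item name to a bool."""
    missing = [item for item in ITEMS if item not in passed]
    if missing:
        raise ValueError(f"unscored items: {missing}")
    return sum(1 for item in ITEMS if passed[item])


# Invented record: the subject passes only three of the items.
record = {item: False for item in ITEMS}
record["postural_sway"] = True
record["hand_lowering"] = hand_lowering_passed(8)
record["amnesia"] = amnesia_passed(2)

day_score = score_session(record)
print(day_score)
```

Adding the scores from the two days gives the 24-point total used in the later analyses.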

Equivalence of the Two Forms. For the 
purpose of standardization, the two forms 
of the scale were given, half in the order 
A-B and half in the order B-A. There were 
no significant differences between forms, or 
between the scores on 2 days of hypnosis. 
For most of the analyses of this report the 
two forms will be considered as merely two 


halves of one total test, and the orders in 
which the forms were given will be ignored. 

Reliability. The reliabilities as determined 
for each day separately, for the total 2-day 
score, and the retest reliability, are given in 
Table 6. Because the reliabilities for 1-day 
scores are .83, the retest reliability, on the 
assumption of perfect correspondence be- 
tween the 2 days, should also be .83, as in- 
deed it does turn out to be. Also, if the 
score of each of the days is considered to 
be but half of a total test, the Spearman- 
Brown formula would predict a reliability 
of the total 2-day test of .91, as indeed it is 
found to be by the Kuder-Richardson for- 
mula (their Formula 20). Hence the inter- 
relationships of the coefficients of Table 6 
are consistent with the interpretations being 
made of the test results. The most satisfac- 
tory reliability is obviously that of the total 
2-day scores (.91). This is high enough to 
permit the use of these scores as criteria for 
other purposes. 
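
The Spearman-Brown prediction quoted above can be verified directly: doubling a test whose reliability is .83 gives 2(.83)/(1 + .83), or about .91. The sketch below shows that arithmetic together with a plain implementation of Kuder-Richardson Formula 20. The function names are hypothetical, the small item matrix is invented, and the total-score variance is computed with N in the denominator, which is an assumption about the computing convention.

```python
def spearman_brown(r, k=2.0):
    """Predicted reliability of a test lengthened k times."""
    return k * r / (1 + (k - 1) * r)


def kr20(item_matrix):
    """Kuder-Richardson Formula 20 for pass-fail (0/1) items.

    `item_matrix` holds one row per subject and one 0/1 entry per item.
    """
    n_subjects = len(item_matrix)
    n_items = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n_subjects
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_subjects
    if var_total == 0:
        raise ValueError("total scores must vary")
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_matrix) / n_subjects  # proportion passing
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


two_day_reliability = spearman_brown(0.83)
print(round(two_day_reliability, 2))

# Invented 4-subject, 3-item matrix, for illustration only.
toy_matrix = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
toy_kr20 = kr20(toy_matrix)
print(round(toy_kr20, 2))
```

The first value reproduces the .91 predicted in the text for the 2-day total.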

TABLE 6

RELIABILITY COEFFICIENTS: STANFORD HYPNOTIC SUSCEPTIBILITY SCALE

Type of reliability                             Cases (N)   Reliability coefficient
Kuder-Richardson
  Day 1 (Forms A, B)                               124             .83
  Day 2 (Forms A, B)                               124             .83
  Days 1 + 2 (Forms A + B)                         124             .91
Retest
  Day 1 (Form A) vs. Day 2 (Form B)                 60             .78
  Day 1 (Form B) vs. Day 2 (Form A)                 64             .87
  Day 1 (Forms A, B) vs. Day 2 (Forms A, B)        124             .83


The changes in score between the 2 days
are of some interest. Therefore a scatterplot
of the 2-day performances is given in
Figure 2. Analysis shows that 68% of the
cases have a second day's score within one
point of the first day, and 93% have scores
on the 2 days within three points of each
other.


Fig. 2. Scatterplot of scores on 2 days of hypnosis.
(The days followed each other at 24- or 48-hour
intervals; no differences attributable to the
interval were detected.)

Validity. The validity of the scale has 
been tested by bringing back to the labora- 
tory subjects from both the high and the 
low end of the distribution. 


The Stanford Hypnotic Susceptibility 
Scale is the first of several scales being de- 
veloped in this laboratory. For convenience 
we may refer to it as Part I, and talk about 
the later scales as Parts II and III. The 
next scale beyond Part I is designed to 
sample a more varied set of behaviors under 
hypnosis, so that, in effect, it will spread out 
the better subjects of Part I into those who 
can go much further and those who have 
gone about as far in hypnosis as they are 
ready to go. Although this scale, known as 
Part II—Extended Susceptibility Scale, is 
undergoing revision, a preliminary form 
was being tested in 1958-59, when the present
sample was collected. Its general nature
can be inferred from the 15 items that 
entered into the scores of this preliminary 
form. Each of these items was scored on a 
pass-fail basis, just as in the scoring of 
Part I: 


1. Inability to stand up
2. Anesthesia (hand)
3. Taste hallucination
4. Smell hallucination
5. Heat illusion
6. Music hallucination
7. Visual hallucination of sport event
8. Regression to a recently seen motion picture
9. Dreaming under hypnosis
10. Visual hallucination of record player (eyes open)
11. Suggested deafness: inability to hear tapping sounds
12. Negative visual hallucination: missing clock hand
13. Posthypnotic suggestion
14. Amnesia
15. Reinduction of hypnosis at a signal

In connection with the development of 
the scales of Part II, we sought to rehyp- 
notize as many as possible of the subjects 
who scored high on the susceptibility scale. 
We succeeded in bringing back for further 
study 21 of 28 or 75% of the highest scor- 
ing subjects, all having scored 16 points or 
more of the possible 24 points on the first 2 
days. Of these 21 subjects, 15 (72%) 
scored in the upper half of the Part II scale 
(Table 7). We also tested a few subjects 
who had not scored as high on the suscepti- 
bility scale, but were potentially promising 
because they had shown considerable amnesia.
There were 21 of these subjects, representing
a selected third of the 62 subjects
scoring between 7 and 15 points out of the
24 possible on the first 2 days. Of these 21
subjects, only 5 (24%) scored in the upper
half on Part II. We thus see that the better
subjects on the susceptibility scale were a
more promising pool for providing good
subjects on Part II than the poorer subjects
on the susceptibility scale. This gives us
some confidence in the validity of the measures,
although the occasional exceptions require
further study. We shall turn later to
some questions of dimensionality raised by
these exceptions.


TABLE 7

RELATIONSHIP BETWEEN SCORES ON PART I (SUSCEPTIBILITY SCALE)
AND SCORES ON PART II (DEPTH SCALE)

                                     Part II (Depth)
Part I (Susceptibility)          Low scores   High scores   Total
(Days 1 and 2 combined)            (0-7)        (0-14)
High scores (16-24)                   6           15          21
Medium scores (7-15)                 16            5          21
Total                                22           20          42

Chi square = 9.55; .01 > p > .001.

Note. The Depth Scale is not yet in form for publication.
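
The chi square reported with Table 7 can be retraced from the four cell counts. The sketch below uses the shortcut formula for a fourfold table without a continuity correction; that choice is an assumption, made because it reproduces the printed value. The function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Chi square for the fourfold table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Cell counts from Table 7: rows are Part I high and medium scorers,
# columns are Part II low and high scorers.
chi2 = chi_square_2x2(6, 15, 16, 5)
print(round(chi2, 2))

# With 1 degree of freedom the critical values are 6.635 (p = .01) and
# 10.828 (p = .001), so the statistic falls between the two levels.
between_levels = 6.635 < chi2 < 10.828
print(between_levels)
```

The result agrees with the printed chi square of 9.55 and with the stated probability range.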

A number of subjects who scored very 
low on the susceptibility scale were also invited
back individually for a further hypnotic
session. In these cases the hypnotist de- 
parted entirely from standard procedures, 
and did his best to capitalize on the known 
successes of the subject on the prior days, 
and on any cues picked up during the at- 
tempted induction. Review of the protocols 
of the 17 subjects tested in this way (subjects
who scored 0-11 on 2 days, with a
median of 7 out of a possible 24), showed 
performances on this third day with no 
greater success than would be predicted 
from the first day scoring level. There were 


2 We wish to acknowledge our indebtedness to 
Jay D. Haley of the Palo Alto Veterans Adminis- 
tration Hospital for assistance in this portion of 
the experiment. 


two doubtful cases, in which responsiveness 
appeared to increase, but neither of these 
cases demonstrated amnesia or responded to 
posthypnotic suggestion. Hence it appears 
that subjects refractory on the susceptibility 
scale, using the standard method of induc- 
tion, were also refractory when another 
method was used. Generalization from this 
finding is limited because only one addi- 
tional hour was used in the attempted rein- 
duction. 

These results give us some confidence 
that our susceptibility scale is both reliable 
and valid. The results have to be inter- 
preted with some caution because they are 
limited to a sample of college students, no 
long-term hypnosis was tried, and we did 
not manipulate the motivation of our sub- 
jects. But within these limitations, we have 
a scale that is dependable enough to be used 
in the study of the lawfulness of hypnotic 
phenomena.


POPULATION STUDIED 


The 124 university undergraduates (64 
men and 60 women) whose scores are here 
considered were volunteers fulfilling part of 
the course requirement in introductory psy- 
chology at Stanford. We need to ask: What 
kinds of students come to Stanford? How 
representative are our subjects of the
Stanford population?

As a private university with high admis- 
sion standards and high tuition fees, Stan- 
ford attracts students who represent a selec- 
tion largely from upper middle class homes, 
although the prevalence of scholarships and 
work opportunities means that there are 
many bright students from lower socio- 
economic strata as well. Stanford is coedu- 
cational, with some 5,000 undergraduate 
students of whom about 1,700 are women. 

The course in introductory psychology is
a popular one, so that about two-thirds of 
the students take the course at some time 
during their undergraduate years. It is 
chosen by engineering and other preprofes- 
sional students as well as by the liberal arts 
students. Comparison of the majors of our
hypnotic subjects with the distribution of
majors in the university at large shows that
we have a reasonable cross section of
student interest represented (Table 8).


TABLE 8

UNDERGRADUATE MAJORS OF HYPNOTIC SUBJECTS COMPARED WITH DISTRIBUTION
OF MAJORS IN THE TOTAL UNDERGRADUATE BODY

                                      Male students who have      Female students who have
                                        designated majors            designated majors
                                      Total           Hypnotic    Total           Hypnotic
                                      undergraduate   sample      undergraduate   sample
Major                                 (%)             (%)         (%)             (%)
Humanities and Sciences
  Humanities                              8              15         [?]             [?]
  Social sciences                        35              36         [?]             [?]
  Physical and biological sciences       17              11          12               2
Engineering and Mineral Sciences         38              25           1               0
Professional and Preprofessional          2              13          16              24
Total                                   100             100         100             100
Number of cases                       2,359              55       1,109              55
Major not chosen                        903               9         568               5
Grand Total                           3,262              64       1,677              60

We may assume that from the point of
view of demographic characteristics our 
sample is fairly representative of the Stan- 
ford undergraduate. The question remains 
whether or not any subtle bias enters 
through the process of volunteering for an 
experiment involving hypnosis. We have 
attempted to get an index to this bias by 
having members of the class volunteer first 
for a session in which they have an oppor- 
tunity to complete a personality inventory, 
and then we have limited the group accept- 
able for hypnosis to those who first volun- 
teered for the personality inventory. If 
there are personality factors leading some 
kinds of people to volunteer for hypnosis, 
and others to refrain from volunteering, we
should be able to detect some differences in
their scores on the personality inventory. 


There were limited opportunities to participate
in the hypnotic experiments, and many
of those listed as nonvolunteers would 
gladly have participated. Hence it must not 
be assumed that any large fraction of the 
nonvolunteers had qualms about hypnosis. 
Some illustrative results are given in Table 9. 

The means and standard deviations come
very near to the norms as published for the
two scales reported (Gough, 1957). For
example, for a sample of 680 college stu- 
dents the Dominance scale is reported to 
yield a mean of 28.5 with a standard devia- 
tion of 6.0, and the Self-Control scale a 
mean of 29.2 with a standard deviation of 
7.1. For our subjects the means for Domi- 
nance and Self-Control are 29.2 (SD = 6.1) 
and 27.4 (SD = 6.5), respectively. 

The only evidence of any bias is in the 
somewhat lower scores on the Self-Control 
scale of our hypnotic subjects as against the 
nonvolunteer sample. The difference approaches
significance for the female subjects
(p = .05), and lies in the same direction
for the male subjects, although the
difference for them is not significant. Be- 
cause within this sample the Self-Control 
scale correlates r = -.39 with our 2-day
hypnotic scores for women students, any 
bias introduced would tend to raise the level 
of susceptibility of our women subjects 
against that of the total student sample. 
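
The critical ratios reported for the volunteer and nonvolunteer samples can be checked with a short calculation. The sketch below assumes the critical ratio is the difference between two independent means divided by the unpooled standard error of that difference; the monograph does not state its formula, so this is an assumption, although it does reproduce the male-subject values of Table 9 to one decimal place. The function name `critical_ratio` is ours.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two independent means over its standard error.

    Assumes an unpooled standard error, sqrt(sd_a^2/n_a + sd_b^2/n_b);
    this formula is an assumption, not quoted from the monograph.
    """
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Male subjects, Table 9: volunteers (N = 62) vs. nonvolunteers (N = 92).
cr_dominance = critical_ratio(29.8, 6.3, 62, 29.2, 5.3, 92)
cr_self_control = critical_ratio(27.1, 6.7, 62, 28.2, 7.2, 92)

print(f"Dominance critical ratio:    {cr_dominance:.2f}")
print(f"Self-Control critical ratio: {cr_self_control:.2f}")
```

The sign of the result follows the direction of the difference (volunteers minus nonvolunteers); the table reports magnitudes only.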

In summary, the sample represents a
cross section of the Stanford undergraduate 
student body with respect to choice of 
major. It is possible that there is some 
slight bias in the volunteer sample; if there 
is, it overrepresents slightly the susceptible
women students.


DISTRIBUTION OF SUSCEPTIBILITY 


Having established that the Stanford 
Hypnotic Scale yields data that show both 
reliability and validity, and that the sample
is at least a moderately satisfactory one to
represent Stanford undergraduates, we are
prepared now to turn to the more detailed 
findings with respect to individual differ- 
ences in scores. 

TABLE 9

SAMPLE VOLUNTEERING FOR HYPNOSIS COMPARED WITH SAMPLE NOT VOLUNTEERING
AMONG THOSE WHO TOOK PERSONALITY TESTᵃ

                                          Male subjects                     Female subjects
                                       Dominance   Self-Control          Dominance   Self-Control
                                         (Do)          (Sc)                (Do)          (Sc)
Sample                            N     M     σ     M     σ        N      M     σ     M     σ
Volunteered for hypnosis         62   29.8   6.3   27.1   6.7      59   28.7   6.0   27.7   6.3
Did not volunteer for hypnosis   92   29.2   5.3   28.2   7.2   14(?)   28.5   5.5   30.1   99(?)
Differences between means
  (Hyp. - nonhyp.)                     0.6         -1.1                   0.2         -2.4
Critical ratio                         0.6          1.0                   0.2          2.0*

ᵃ California Psychological Inventory. Palo Alto, Calif.: Consulting Psychologists Press.
* p < .05.


TABLE 10

RAW SCORE DISTRIBUTIONS: ALL SUBJECTS, BOTH DAYS SINGLY AND COMBINED

                      Scores on single days      Scores on 2 days combined
                      Score   Day 1   Day 2      Score     Day 1 + Day 2
More susceptible        12       3       4        24             2
                        11       6       7       22-23           3
                        10       9      11       20-21          11
                         9       5       5       18-19           9
                         8       8       6       16-17           3
  Subtotal                      31      33                      28
Less susceptible         7      11       8       14-15          10
                         6      10      10       12-13          12
                         5      18      19       10-11          16
                         4      13      19        8-9           15
                         3      12      12        6-7           17
                         2      13      11        4-5           11
                         1       9       7        2-3            8
                         0       7       5        0-1            7
  Subtotal                      93      91                      96
N                              124     124                     124
M                             5.25    5.48                   10.73
σ                             3.22    3.20                    6.14

Day 1 vs. Day 2: Mean difference = .23
                 σ_DM = .17
                 CR = 1.35 (p = .09, one-tailed test)


The score distributions for the 2 days
separately, and for the 2 days combined, are
presented in Table 10, along with some
statistical measures based on the distributions.
The change in scores between the
first and second day is in the expected
direction of a slight mean increase but the
difference is not statistically significant.
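
The Day 1 versus Day 2 comparison at the foot of Table 10 can be reproduced approximately. The sketch below assumes that the standard error of the mean difference was computed for correlated (repeated) measures, and that the Day 1 to Day 2 correlation is the retest reliability of .83 given in the Summary; neither assumption is stated with the table. The one-tailed probability is taken from the normal curve at the reported critical ratio of 1.35.

```python
import math


def se_difference_correlated(sd_1, sd_2, r, n):
    """Standard error of the difference between two correlated means."""
    variance = (sd_1 ** 2 + sd_2 ** 2 - 2 * r * sd_1 * sd_2) / n
    return math.sqrt(variance)


def upper_tail_p(z):
    """Area of the unit normal curve above z (one-tailed probability)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Table 10 values; r = .83 is an assumed Day 1 vs. Day 2 correlation.
se_dm = se_difference_correlated(3.22, 3.20, 0.83, 124)
p_one_tailed = upper_tail_p(1.35)

print(f"Standard error of the mean difference: {se_dm:.3f}")
print(f"One-tailed p at CR = 1.35:             {p_one_tailed:.3f}")
```

Under these assumptions the standard error rounds to the tabled .17 and the probability to the tabled .09.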

Because there were no sex differences 
(CR = 0.35) the sexes are not treated 
separately. This finding disagrees with that 
of our first year of experimentation
(Hilgard et al., 1958), when it was found that
women were more susceptible than men. 
While a somewhat different procedure was 
used, we have been unable to determine any 
basis for the disagreement. 

FIG. 3. Two normal curves fitted to obtained
data. (Legend: obtained scores; fitted curve;
N = 124.)


When the 2-day scores are plotted as a
frequency distribution (Figure 3) they yield
a strikingly bimodal distribution. As an
illustrative curve of best fit, the combination
of two normal distributions has been
superimposed on the obtained values in Figure 3.
These were obtained by computing means
and standard deviations of the scores above
and below the low point as though they
were separate distributions. Half the scores
at the low point were assigned to the lower
distribution, half to the upper. Then points
on the normal curve were obtained,
corresponding to the scale values of the plotted
scores, and the two distributions added
where they overlapped. The curve about the
first mode is based on a mean of 8.07, SD =
4.01, N = 96.5, while the second mode is
represented by a mean of 20.00, SD = 1.93,
N = 27.5. Departure of the true points
from the fitted curve is well within the 
limits of chance, as determined by a chi 
square test for goodness of fit (p = .95). 
That is, if the "true" population of scores 
is distributed according to the fitted curve, 
deviations of the size found would be ex- 
pected 95 out of 100 times for a sample of 
this size. 
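
The two-curve fit described above can be sketched numerically. The example below uses the component means, standard deviations, and Ns reported in the text, together with the grouped 2-day frequencies of Table 10. It is only an approximation of the original procedure: the fit was presumably made on ungrouped scores, the grouping of cells for the chi square test is not described, and the function names are ours. No attempt is made to reproduce the reported probability.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative normal distribution function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2))))


# (mean, SD, N) of the two fitted components, as reported in the text.
COMPONENTS = [(8.07, 4.01, 96.5), (20.00, 1.93, 27.5)]

# Score groups and obtained frequencies for the 2 days combined (Table 10).
GROUPS = [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9), (10, 11), (12, 13),
          (14, 15), (16, 17), (18, 19), (20, 21), (22, 23), (24, 24)]
OBTAINED = [7, 8, 11, 17, 15, 16, 12, 10, 3, 9, 11, 3, 2]


def expected_frequency(low, high):
    """Expected count for integer scores low..high under the two-curve fit."""
    total = 0.0
    for mean, sd, n in COMPONENTS:
        area = normal_cdf(high + 0.5, mean, sd) - normal_cdf(low - 0.5, mean, sd)
        total += n * area
    return total


expected = [expected_frequency(low, high) for low, high in GROUPS]

# Unpooled chi square over the groups; cells with small expected counts
# would ordinarily be combined before the value is interpreted.
chi_square = sum((o - e) ** 2 / e for o, e in zip(OBTAINED, expected))

for (low, high), o, e in zip(GROUPS, OBTAINED, expected):
    print(f"{low:2d}-{high:2d}  obtained {o:3d}  fitted {e:6.2f}")
print(f"chi square (unpooled) = {chi_square:.2f}")
```

The fitted frequencies dip near scores 16-17 and rise again near 20-21, which is the bimodal shape the two components were chosen to describe.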

A single normal curve can also be fitted 
to the data without doing much violence, 
because the second mode is a small one. The 
departure of the obtained scores from a best 
fit normal curve is not bad by the chi square 
criterion (p = .50), so that it is permissible 
to enter the scores into correlations. 

The bimodality of scores is consistent 
with the findings of the earlier investigation 
(Hilgard et al., 1958). Because bimodality 
is rather unusual in psychological investiga- 
tions, the bimodality of the distributions de- 
serves special consideration. 

The bimodality of score distributions, 
found in two successive yearly studies using 
somewhat different procedures, raises inter- 
esting questions. Does it mean that there is 
some sort of "type" distribution underlying 
susceptibility to hypnosis, or does it mean 
that some procedure in assembling test 
items and in weighting scores has produced 
the bimodality ? 


Problem of Bimodality 


We may begin with the assumption that 
the distribution of susceptibility is not in- 
herently bimodal, and show, under this as- 
sumption, how bimodality can arise. That 
bimodality is not inevitable has already been 
made clear in our earlier study in which, 
using the Friedlander-Sarbin weights, the
distribution was in the form of an inverted
J, while by using dichotomized scores the
distribution became bimodal (Hilgard et al.,
1958). This finding has led us to examine 
some of the things that happen when scores 
are dichotomized. The following principles 
are known to statisticians, but they are 
seldom of interest to psychologists because 
they apply primarily to scales built of items
that correlate higher than most psycholog- 
ical scores correlate. 


1. If dichotomized items of equal difficulty
are combined into a scale by simple
addition, the higher the intercorrelations be- 
tween the items, the greater the bimodality 
of the resulting distribution. It is possible 
to begin with items that split exactly 50-50, 
so that there is no inherent tendency to 
prejudice the form of distribution, and then 
to combine them to form various kinds of 
symmetrical distribution. The higher their
intercorrelations, the more evident the bi- 
modality of their composite scores will be 
(e.g., Guilford, 1950, p. 491). 


2. If highly intercorrelated dichotomized 
items are of unequal difficulty, the form of 
the distribution of scores based on adding 
the item scores will depend upon the distri- 
bution of the item difficulty. The simplest 
way to demonstrate this empirically is to 
arrange some artificial scores into Guttman- 
type scales, constructing the scale to yield 
the highest possible item intercorrelation for 
dichotomized items varying in difficulty. 
The bimodality that would result if the 
items were of equal difficulty need not result
if the items are of sufficiently different
difficulties. Some illustrations are given in 
Figure 4. It is easy to see how manipula- 
tion of item difficulty can produce any de- 
sired form of distribution. 
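
The first principle can be illustrated by a small simulation. The sketch below is not taken from the monograph: it assumes that each item reflects an underlying normal variable sharing one common factor, that every item is cut at its median (a 50-50 split), and that `rho` is the correlation between the underlying variables. All names and parameter values are illustrative.

```python
import math
import random


def simulate_scores(n_subjects, n_items, rho, seed=0):
    """Frequency of each total score (0..n_items) for 50-50 dichotomized items.

    rho is the assumed correlation between the underlying continuous item
    variables; it is an illustrative parameter, not a value from the study.
    """
    rng = random.Random(seed)
    common_weight = math.sqrt(rho)
    unique_weight = math.sqrt(1.0 - rho)
    counts = [0] * (n_items + 1)
    for _subject in range(n_subjects):
        common = rng.gauss(0.0, 1.0)
        score = 0
        for _item in range(n_items):
            latent = common_weight * common + unique_weight * rng.gauss(0.0, 1.0)
            if latent > 0.0:  # cut at the median, so each item splits 50-50
                score += 1
        counts[score] += 1
    return counts


low = simulate_scores(4000, 12, rho=0.1)
high = simulate_scores(4000, 12, rho=0.9)

print("low intercorrelation: ", low)
print("high intercorrelation:", high)
```

With weakly correlated items the scores pile up in the middle of the range; with highly correlated items they pile up at the two extremes, even though every item is equally difficult in both runs.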

The demonstrations of Figure 4 show 
that with high correlations and some choice 
in item difficulty, almost anything can hap- 
pen to the form of score distribution. 
Therefore one must be very careful not to 
assert anything about the distribution of 
the phenomena underlying the scores unless 
more information is available than a distribution
of scores added up from dichotomized
items.³
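
The dependence of the score distribution on item difficulty can be shown exactly for the limiting case of a perfect Guttman-type scale. In such a scale a subject who passes a given item also passes every easier item, so the proportion of subjects at each total score is simply the gap between successive pass proportions. The sketch below illustrates this under that idealized assumption; the pass proportions are invented for the example and are not those of Figure 4.

```python
def guttman_score_distribution(pass_proportions):
    """Proportion of subjects at each total score for a perfect Guttman scale.

    pass_proportions: proportion of subjects passing each item (hypothetical).
    Returns a list indexed by total score, from 0 to the number of items.
    """
    ordered = sorted(pass_proportions, reverse=True)  # easiest item first
    bounds = [1.0] + ordered + [0.0]
    # A subject scores exactly s by passing the s easiest items and no others,
    # so the share at score s is the gap between consecutive pass proportions.
    return [bounds[s] - bounds[s + 1] for s in range(len(ordered) + 1)]


equal_difficulty = guttman_score_distribution([0.5, 0.5, 0.5, 0.5])
spread_difficulty = guttman_score_distribution([0.8, 0.6, 0.4, 0.2])

print("equal difficulty: ", equal_difficulty)
print("spread difficulty:", [round(p, 2) for p in spread_difficulty])
```

Four items of equal difficulty put half the subjects at a score of 0 and half at a score of 4, while four evenly spaced items give a rectangular distribution; other spacings give other forms.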

Both the principles above apply to our 
data, for we have a number of highly inter- 
correlated items, and they vary in difficulty. 


3 Eysenck and Furneaux (1945) in discussing
the problem of score distribution consider the
consequences of cutting points on otherwise normally
distributed scores. Whenever scores are dichotomized,
some statistical problems arise.
FIG. 4. How distribution of item difficulty affects
the form of distribution of scores based on highly 
intercorrelated items. 


By selecting items of appropriate difficulty 
(or by dichotomizing them somewhat dif- 
ferently) we could obtain almost any form 
of distribution. 

True bimodality may exist. Despite the 
caution that is needed in asserting anything 
about the "true" form of distribution, one 
item in our scale is measured by a "natural" 
kind of scale, so that the form of distribution
can be determined for this item with a
minimum of artifacts due to scaling. This
is the item which tests recall under
suggested amnesia. The form of the interrogatory
at the end of the hypnotic session
permitted the subject to recall from 0 to 10
items, and his raw score is simply the number
of items recalled. There are no constraints
on this score that should make it
bimodal, unless there is something about the
distribution of amnesia that is inherently
bimodal. The distribution is plotted in
Figure 5. The scores have been converted from
items recalled to items forgotten, in order
to make the form of distribution conform
to that used in the distribution of susceptibility
in Figure 3. The distribution is strikingly
bimodal, and the number of cases in
the upper mode (33 of 124) is not far from
the 28 of 124 in the upper mode of the total
susceptibility scale (Table 10). These results
keep alive the possibility that the true
distribution of hypnotic susceptibility in our
population is bimodal.


FIG. 5. Distribution of amnesia scores obtained
following each of 2 days of hypnotic induction.
(N = 124)

The only way in which the choice of items 
could bias the form of the amnesia distribution
would be by way of some causal influence
upon recall of the items as a consequence
of experience with them. The
correlation between other scale items
and amnesia might conceivably result in a
similarity in form between the distribution
of susceptibility and the distribution of
amnesia. In that case the resultant susceptibility
is in reality bimodally distributed,
but its distribution might be artificially
produced by the kinds of items used in the
test. Actually we shall elsewhere report
some analyses showing that recall is in fact
affected by the nature of the items, but the
tendency is to have somewhat less amnesia
for successful than for unsuccessful items,
when depth of hypnosis is taken into account
(Hilgard & Hommel, 1961). Any
similarity between the forms of distribution
for susceptibility and for amnesia
must therefore depend upon fairly complex
underlying processes.

We leave the problem of bimodality here.
On the one hand we have noted conditions 


under which the form of the curve of dis- 
tribution can be artificially manipulated, 
which leads us to distrust the bimodality of 
our scores derived from summing dichoto- 
mized items. On the other hand, the pres- 
ence of bimodality in the amnesia scores 
suggests the possibility of some genuine bi- 
modality underlying the results. 


Problem of Dimensionality 


There have often been suggestions that 
there are different kinds of hypnotic sus- 
ceptibility, as reflected in the differential 
passing of unlike items. Thus White (1937)
proposes that there are two kinds of hyp- 
notic trance (active and passive) with dif- 
ferent personality correlates. Eysenck and 
Furneaux (1945) propose a distinction be- 
tween primary and secondary suggestibility, 
as measured by different items; they also 
suggest a difference between active and pas- 
sive subjects. It is important to examine 
our criteria measures, therefore, to deter- 
mine if there is any evidence for more than 
one kind of ability underlying hypnotic 
susceptibility. 

It should be pointed out that hypnosis it- 
self is classified by Eysenck and Furneaux 
as illustrating "primary" suggestibility, and 
the tests that correlate positively with it are 
all tests of primary suggestibility; secondary
suggestibility, which they sometimes refer
to as "gullibility," does not correlate
with hypnosis.⁴ If we should fail to find
more than one kind of trait making up sug- 
gestibility to hypnosis (among items predic- 
tive of hypnosis) we would not be contra- 
dicting their findings. 

4 A study recently completed in our laboratory
(Moore, 1961) lends support to this distinction.
See also Stukát (1958).


Item Correlation with Total Scale. One
line of evidence suggesting that there is a
common dimension running through all of
the items of the scale is furnished by the
relatively high correlation between the score
on each item and the total score minus that
item. Because each item is dichotomized,
and the total score can be expressed along
a scale, it is appropriate to use biserial
correlations (Table 11). For the scores of
Day 1, as shown in the table, these correlations
range from a low of .38 for postural
sway, measured prior to induction, to a high
of .83 for arm rigidity. These correlations
indicate a high common factor running
through the scale.


TABLE 11

CONTRIBUTION OF EACH ITEM WITHIN THE TOTAL SCALE
(N = 124 throughout)

                                          Reliability          Correlation with
                            Percentage    (Day 1 vs. Day 2)    total scale minus
Item                        passing       tetrachoric r's      this item, biserial r's
Postural sway                  69             .96                  .38
Eye closure                    58             .78                  .51
Hand lowering                  81             .83                  .63
Arm immobilization             14             .74                  18(?)
Finger lock                    32             .83                  .72
Arm rigidity                   32             .88                  .83
Moving hands                   70             .75                  291(?)
Verbal inhibition              23             .94                  .79
Hallucination                  35             .71                  .55
Eye catalepsy                  30             .94                  .79
Posthypnotic suggestion        49             .60                  .60
Amnesia                        32             [?]                  .69
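
The item-versus-remainder correlations of Table 11 are biserial coefficients. A minimal sketch of the usual textbook formula follows, r_b = ((M_p - M_q) / σ_t) (pq / y), where M_p and M_q are the mean remainder scores of subjects passing and failing the item, σ_t is the standard deviation of all remainder scores, p and q are the proportions passing and failing, and y is the ordinate of the unit normal curve at the point dividing p from q. The data in the example are hypothetical, and the function name `biserial_r` is ours; the monograph does not describe its computing procedure.

```python
from statistics import NormalDist, mean, pstdev


def biserial_r(item_passed, remainder_scores):
    """Biserial correlation between a pass/fail item and a continuous score.

    item_passed: 1 (pass) or 0 (fail) for each subject.
    remainder_scores: each subject's total score minus this item.
    Requires at least one pass and one fail.
    """
    passed = [s for s, ok in zip(remainder_scores, item_passed) if ok]
    failed = [s for s, ok in zip(remainder_scores, item_passed) if not ok]
    p = len(passed) / len(remainder_scores)
    q = 1.0 - p
    unit_normal = NormalDist()
    ordinate = unit_normal.pdf(unit_normal.inv_cdf(p))  # curve height at the cut
    return (mean(passed) - mean(failed)) / pstdev(remainder_scores) * (p * q / ordinate)


# Hypothetical scores for six subjects, three of whom pass the item.
item = [1, 1, 1, 0, 0, 0]
rest = [8, 6, 5, 7, 4, 3]
r_b = biserial_r(item, rest)

print(f"biserial r = {r_b:.3f}")
```

Because the coefficient assumes a normal variable underlying the dichotomy, it runs higher than the point-biserial on the same data and can exceed 1.00 when the split is very clean.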

The one rejected item (arm catalepsy) 
was discarded because it did not correlate 
with the rest. The item consisted of the pas- 
sive raising of the forearm with the elbow 
resting on the arm of the chair. It was 
scored as a pass if the arm remained in posi- 
tion rather than returning to the arm of the 
chair. While the 2-day retest reliability was 
satisfactory (r_t = .92), the biserial correlation
on Day 1 with total score minus that
item was —.14. Thus in part through item 
selection, in part through the nature of the 
phenomena themselves, our scale is highly 
saturated with what may be designated 
"primary suggestibility." 

A Guttman-Type Scale. Guttman (1950) 
has proposed an arrangement of items in at- 
titude scales to determine whether or not 
there is a single dimension running through 
the items. His type of scale can be applied 
to other kinds of tests provided there are 
high intercorrelations among the items. If 
items yield dichotomized scores (as ours 
do), then all that is necessary is to arrange 


the items in order of descending difficulty 
(i.e., percentage passing) along one axis, 
and subjects in descending order of total 
scores along the other axis. Then a plot of 
success by item by subject will yield a tri- 
angular distribution, with the "plus" signs 
concentrating in the upper left of the dia- 
gram, and the minus signs in the lower 
right. A fraction of such a diagram, using 
actual data from the first 32 subjects of our 
study, is presented in Table 12. Ideally, no 
subject should have had any successes (+’s) 
to the right of the solid line. 
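
How closely a set of pass-fail records follows this triangular pattern can be expressed as the proportion of responses that fall where a perfect scale would place them. The sketch below uses one common counting convention, which is an assumption here: for a subject with a total score of s, the ideal pattern is a pass on the s easiest items and a failure on the rest, and every cell departing from that pattern is counted as one error. Guttman's own procedures for counting errors differ in detail, and the data are hypothetical.

```python
def reproducibility(responses):
    """Proportion of responses agreeing with the ideal Guttman pattern.

    responses: one list of 0/1 item scores per subject (hypothetical data).
    """
    n_subjects = len(responses)
    n_items = len(responses[0])
    # Order the items from most often passed to least often passed.
    pass_counts = [sum(row[j] for row in responses) for j in range(n_items)]
    order = sorted(range(n_items), key=lambda j: -pass_counts[j])
    errors = 0
    for row in responses:
        total = sum(row)
        for rank, j in enumerate(order):
            ideal = 1 if rank < total else 0  # pass only the `total` easiest items
            if row[j] != ideal:
                errors += 1
    return 1.0 - errors / (n_subjects * n_items)


perfect = [[1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
flawed = [[1, 1, 1, 0], [1, 0, 0, 0], [0, 1, 0, 0]]

rep_perfect = reproducibility(perfect)
rep_flawed = reproducibility(flawed)
print(f"perfect scale: {rep_perfect:.2f}")
print(f"flawed scale:  {rep_flawed:.2f}")
```

In the flawed example the third subject passes the second item while failing the first, which counts as two misplaced responses out of twelve.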

When items are arranged as in Table 12 
it is possible to compute the coefficient of 
reproducibility as a percentage of the items 
falling where they should fall if the scales 
were perfect, that is, if there were no blanks
above the stepwise line in the table. This 
specimen table has a coefficient of repro- 
ducibility of .92, which satisfies the Gutt- 
man criterion of .90 for a satisfactory scale. 
Applying this method to all our data for 
124 subjects for the 24 items of 2 days of 
hypnosis we find a coefficient of reproduci- 
bility of .88, which, while slightly below the 
value designated as desirable by Guttman, 
comes close to it, and strongly supports the 
interpretation that the tests measure a
unidimensional trait. The Guttman scaling is
imperfect, however, and it is possible that
some second dimension is interfering with
unidimensionality.


TABLE 12

ILLUSTRATIONS OF GUTTMAN-TYPE SCALING FOR ITEMS FROM HYPNOTIC SUSCEPTIBILITY SCALE

[Subject-by-item plot of + and - scores not recoverable.]

Note. Item identification (in order of presentation):
1. Postural sway
2. Eye closure
3. Hand lowering
4. Immobilization (Arm)
5. Finger lock
6. Arm rigidity
7. Hand movement
8. Verbal inhibition
9. Hallucination
10. Eye catalepsy
11. Posthypnotic suggestion
12. Amnesia


Item Intercorrelations and Factor Analysis.
Dichotomized items have to be intercorrelated
from fourfold tables. When we
plotted our data we noted that there were
often empty cells, owing to the high
intercorrelations and the extreme splits for some
of the items. Because such tables yield
indeterminate results, we decided to use 2-day
scores, which fall on a three-point scale
(i.e., - -, - +, + +). We then dichotomized
anew in such a manner as to avoid
empty cells. The resulting distribution of
item difficulty is given in Table 13, in which
the original item difficulties are also shown.
Because of the high intercorrelations between
the days it was not possible to make
very drastic shifts in the cutting point (e.g.,
by accepting a single day's passing as
enough for the more difficult items, and
requiring passing on both days for the easier
ones). The resulting fourfold tables of
intercorrelations turned out to be without
empty cells, however, so that the unambiguous
use of tetrachoric correlations became
possible.


TABLE 13

DISTRIBUTION OF ITEM DIFFICULTY

                             Percentage of subjects passing item
                                                  Days 1 and 2
Item                         Day 1     Day 2      combinedᵃ
Postural sway                  69        81          67
Eye closure                    58        67          52
Arm lowering                   81        86          78
Arm immobilization             14        16          22
Finger lock                    32        31          40
Arm rigidity                   32        26          36
Moving hands                   70        77          64
Verbal inhibition              23        19          32
Hallucination                  35        34          46
Eye catalepsy                  30        31          35
Posthypnotic suggestion        49        52          36
Amnesia                        32        27          39

ᵃ Items dichotomized after combining 2 days, in order to
bring cutting points nearer to 50-50.

A table of intercorrelations was prepared, 
and the table factor analyzed by Thur- 
stone's centroid method. The intercorrela- 
tions, and the loadings on the first factor 
(unrotated) are given in Table 14. In order 
to make it easier to read the table, the items 
have been rearranged in the order of their 
loadings on this first factor. The diagonals 
represent the reliabilities, as determined by 
retest tetrachoric correlations, stepped up 
by the Spearman-Brown formula. 
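
Two of the computations named here are easy to illustrate. The sketch below gives the Spearman-Brown step-up for a doubled test and the first-factor loadings of the centroid method in their simplest form, in which each loading is a column sum of the correlation matrix divided by the square root of the grand total. It assumes all correlations are positive, so that no sign reflection of variables is needed, and it uses a small hypothetical matrix rather than the values of Table 14. Function names are ours.

```python
import math


def spearman_brown(r, factor=2):
    """Reliability of a test lengthened by `factor`, from reliability r."""
    return factor * r / (1 + (factor - 1) * r)


def first_centroid_loadings(matrix):
    """First-factor loadings by the centroid method (all-positive matrix).

    matrix: square correlation matrix with the chosen diagonal entries
    (for example, reliabilities) already in place.
    """
    size = len(matrix)
    column_sums = [sum(row[j] for row in matrix) for j in range(size)]
    grand_total = sum(column_sums)
    return [s / math.sqrt(grand_total) for s in column_sums]


# Hypothetical three-item matrix, for illustration only.
R = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
loadings = first_centroid_loadings(R)
variance_share = sum(a * a for a in loadings) / len(loadings)
stepped_up = spearman_brown(0.94)

print("first-factor loadings:", [round(a, 3) for a in loadings])
print(f"share of variance on first factor: {variance_share:.2f}")
print(f"single-day r of .94 stepped up:    {stepped_up:.2f}")
```

The share of variance attributed to a factor is the sum of its squared loadings divided by the number of items, which is how percentages such as those reported for the first three factors are ordinarily obtained.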

The items in Table 14 have also been 
grouped as “challenge” and “other” items. 


It is of some interest that the five items of
highest loadings, all intercorrelating from
.65 to .87, are of the so-called “challenge”
type, in which the subject is unable to
counteract the suggestions of the hypnotist,
even when told to try. They are also the
most difficult items, as shown in the earlier
Table 13, with only two other items (amnesia
and posthypnotic suggestion) falling
in their range of difficulty. In some sense
amnesia belongs with them, because the subject
is asked to recall after being told that
he will be unable to do so; this is a kind of
challenge. It is not surprising, therefore,
that amnesia has the next position in the
table of intercorrelations.

The loadings on the first factor are high
enough to account for 51% of the variance.
The second and third factors account for
12% and 6% of the variance, respectively,
so that the three factors account for 69%
of the variance. Further factor analyses are
in process, in which items better representative
of the subordinate factors are included.
It is evident, however, that the test as scored
reflects heavily the common factor.

Thus the biserial correlations between
single items and the total scale minus that
item, the Guttman scaling, and the factor
analysis, all point to the same conclusion:
that there is essentially a single dimension
running through these scores, that there is
basically one kind of hypnotic susceptibility
being measured, although the slight differences
in items undoubtedly contribute something,
if nothing else a practical method
of deriving scores covering the whole range
of susceptibility. The results have led us to
retain the uniform weighting of all items
entering into our scale. Because they have
not been adjusted for variance and covariance,
they are not equally weighted, but
further adjustment appeared to be an
unnecessary refinement.

How do these results square with what
others have found? The conclusion of Eysenck
and Furneaux (1945) that there were
two kinds of suggestibility was cited earlier.
It was pointed out, however, that their
“secondary” suggestibility did not correlate
with hypnosis. Because we eliminated all
items not correlating with hypnosis, it is not
surprising that we did not find secondary
suggestibility. Our results are therefore not
in conflict with theirs, for our findings do
not bear on the concept of secondary
suggestibility.


TABLE 14

ITEM INTERCORRELATIONS: DAY 1 AND DAY 2 COMBINED
(N = 124)

                            I. Challenge items                 II. Other items                                   Loading
                                                                                                                 on first
Item                              6      5      4     10      8     12      2      3      9      7     11      1   factor
I. Challenge items
6. Arm rigidity                [9?]    .87    .82    .80    .75    .63    .54    .59    .48    .29    .48    .22      .86
5. Finger lock                        [92]    .65    .78    .90    .78    .58    .44    .48    .30    .22    .30      .85
4. Arm immobility                             [?]    .78    .70    .60    .55    .38    .49    .18    .61    .10      .78
10. Eye catalepsy                                   [97]    .16    .32    .51    .46    .46    .40    .22    .35      .78
8. Verbal inhibition                                       [97]    .58    .42    .42    .39    .29    .21 .23(?)   117(?)
II. Other items
12. Amnesia                                                       [84]    .50    .51    .66    .29    .32    .42      .74
2. Eye closure                                                            [?]    .46    .55    .50    .35    .42      .70
3. Hand lowering                                                                [89]    .36    .71    .48    .40      .69
9. Hallucination                                                                        [?]    .50    .54    .37      .69
7. Moving hands                                                                                [?]    .58    .55      .62
11. Posthypnotic suggestion                                                                           [?]    .30      .59
1. Postural sway                                                                                            [96]     4(?)

Although we were alerted for evidence 
related to his distinction, we found nothing 
that permitted us to classify most subjects 
according to White's (1937) distinction be- 
tween active and passive subjects. 


Das (1958) has recently reported a factor 
analytical study of a scale of hypnotic depth
and reports finding a strong general factor 
and a second much weaker factor account- 
ing for most hypnotic suggestibility. He 
used very few subjects, however, and such 
results as he found must be rather unstable. 


USES OF A SUSCEPTIBILITY SCALE


In validating our scale as one measuring 
susceptibility we have attempted to show 
that it selects those who, when given fur- 
ther opportunities to experience hypnosis, 
are more likely to show the more varied 
phenomena associated with an established 
trance, in contrast to those who with further 
opportunities are less likely to show the 
phenomena of hypnosis. The distinction is 
in this respect one familiar in psychology 
between aptitude tests and achievement 
tests: an aptitude test predicts what can 
happen with further experience, while an 
achievement test indicates what profiting 
there has already been from experience. 
At the same time, we are making use of
the scores on our scale as criterion scores in 
the studying of correlates of hypnotic sus- 
ceptibility. This uses the test scores as sam- 
ples of hypnotic performance. It is entirely 
permissible to use the same set of scores as 
a criterion for one purpose and as them- 
selves predictive scores for other purposes. 
Thus we use college grades as criteria by 
which to validate scholastic aptitude tests, 
but we also use college grades as predictive 
of graduate work. 

To the extent that a susceptibility scale 
predicts those who are good candidates for 
further hypnosis, it is useful in discovering 
promising subjects. This is, however, by no 
means its only use. Another use is finding 
what degree of hypnotic susceptibility is 
needed for certain other purposes, as in the 
use of hypnosis in dentistry, obstetrics, or 
psychotherapy. Another use is to find how 
susceptibility is modified by changes in mo- 
tivation such as occur in confronting child- 
birth, surgery, the pain of burns, and so on. 
For studies in which changes in suscepti- 
bility are to be investigated, the availability 
of equivalent alternate forms (Forms A and 
B) will prove to be of considerable service. 

Other kinds of scales can be developed 
and are in process of construction in this 
laboratory. 

We have thus presented our susceptibility 
scale as but one kind of measure of hypnotic 
ability. Its primary purpose is to find out 
whether or not a given subject is likely to 
achieve a satisfactory trance with further 
hypnotic experience. The evidence at hand 
suggests that it is a fairly satisfactory meas- 
ure for this purpose. It will prove to be of 
greater service when more norms are avail- 
able from populations other than university 
students. To the extent that many experi- 
ments are done with university students, 
even the limited norms are of value. 


SuMMARY 


1. Analyses were made of the responses
of 124 college students (64 men and 60
women) to a newly developed scale of
hypnotic susceptibility known as the Stanford
Hypnotic Susceptibility Scale. The scale
has been prepared in two highly similar
forms (Form A and Form B), each yielding
scores ranging from 0 to 12. All subjects
were scored on both forms, one score
being obtained on each of 2 days of hypnotic
induction.


5 Part II, as described earlier in this report,
is undergoing extensive revision. Another kind of
scale has been developed for special purposes
(Weitzenhoffer & Sjoberg, 1961).

2. For the purposes of these analyses 
hypnotic susceptibility is defined as the 
number of responses representative of hyp- 
nosis yielded within the standard procedures 
of attempted induction and testing. As a 
sample of hypnotic phenomena the scale 
provides a criterion for personality studies; 
as an aptitude test it predicts the capacity to 
go on for more varied and complex hypnotic 
experiences. 

3. The historical reports on susceptibility 
usually have referred to the greatest depth 
of hypnotic trance achieved under various 
methods of induction, and with repeated 
sessions. 


While there is much variation, a summary 
of nineteenth century studies indicated a 
mean result of 9% refractory (nonsusceptible),
29% reaching a drowsy-light state,
36% moderately hypnotizable ("hypnotaxy"),
and 26% reaching a deep or somnambulistic
trance. A similar summary of
investigations since 1930 indicated 22% re- 
fractory, 42% drowsy-light, 26% moderate, 
and 10% deep or somnambulistic. This 
order of disagreement does not seem at all 
surprising in view of the many uncertainties 
in this kind of attempted quantitative com- 
parison. 

Our own results, if divided in this fash- 
ion, would yield about 17% refractory, 35% 
drowsy-light, 25% moderate, and 23% deep 
or somnambulistic, thus falling very much 
in line with earlier studies. 

4. The sample studied was compared with 
the total student body at Stanford, and with 
a larger sample of the introductory psy- 
chology course from which our subjects 
were drawn. The sample is moderately well
representative of Stanford undergraduates.

5. The distribution of susceptibility turns
out to be bimodal in 1958-59 as it was in
1957-58. A special study was made of artifacts
that could lead to bimodality. These
include: (a) high item intercorrelation for
dichotomized items, and (b) a distribution
of item difficulty that includes several items
of nearly equal difficulty. Because both of
these artifacts are present in our data, the
bimodality must be viewed with suspicion.
The presence of bimodality in the amnesia
scores, however, cannot be attributed to
these artifacts, and so leaves open the question
of a “genuine” bimodality underlying
these data.

6. Various analyses of item intercorrela- 
tions lead to the belief that the scale is es- 
sentially unidimensional. The individual 
items show high biserial r’s with the total
score less that item. The Guttman scale is
close to meeting the standard of 90% repro- 
ducibility; a factor analysis shows high 
saturation with a common factor. 


7. Retest reliability of .83 and Kuder- 
Richardson reliability of .91 for the 2-day 
scores mean that the scales are satisfactory 
in establishing criteria for the studies of 
personality correlates that are now in prog- 
ress. Some evidence of validity is provided 
by efforts to hypnotize by other methods 
selected subjects from the sample. The low 
scoring subjects proved refractory by other 
methods; the highest scoring subjects 
proved most able to go on to further hyp- 
notic experiences. 

The psychophysiology of emotion has in-
trigued American psychologists at least
since James' pronouncements on the subject, 
and recently the psychological reaction to 
threat or stress has become an important 
part of the research endeavor of contempo- 
rary psychology. Surprisingly little work 
has been done, however, on the empirical 
relation between physiological and psycho- 
logical indices of reaction to threat. We do 
have extensive studies of visceral response 
to psychological stress (e.g., Lacey, Bate-
man, & Van Lehn, 1953; Wenger, Engel, &
Clemens, 1957), of the effects of stress on 
performance (cf. Lazarus, Deese, & Osler, 
1952), of the effects of threat on verbal dis- 
organization (cf. Rapaport, Gill, & Schafer, 
1945), of the relation between general emo-
tional disturbance and physiology (Alt-
schule, 1953), and of the relation between
self-rating scales and physiology (e.g.,
Raphelson, 1957). Considering, however,
the relative emphasis on both physiology 
and verbal behavior in the current literature 
there is a surprising paucity of material re- 
lating indices of verbal disturbance elicited 
by threat to concomitant physiological 
changes.

When such investigations have been un- 
dertaken they have been hampered by the 
lack of a standardized method for eliciting 
complex verbal behavior on the one hand
and by an almost exclusive reliance on the
ubiquitous galvanic skin response on the
other. For example, Hsü (1952) in the 
course of an extensive factor analysis of 
both PGR and verbal evaluative reactions to 
“emotional” words concluded that the two 
approaches "gave rise to collaborating, but 
not identical, results." The correlations be- 
tween verbal responses and PGR responses 
on the specific stimuli reported vary from 
.16 to .46; the correlation matrix of four
factors derived from the PGR data and two 
factors derived from the rating data shows 
correlation coefficients varying from .15 to 
.38. More recently Blum (1960), using the
Blacky test, demonstrated another relation
between verbal and physiological indices of
anxiety. A factor analysis of variables indi-
cating anxiety potential demonstrated a
primary factor with loadings on absent or 
minimal verbal output and on resistance 
drops during periods of no verbal output. 

It should be noted that in the Hsü study
the verbal data called for ratings by the 
subjects of degree of emotional disturbance 
elicited by the stimulus words—a procedure 
frequently used also in earlier studies of the 
PGR. Such a procedure requires the sub-
ject to rate his own emotionality; it does
not measure emotional behavior. The major 
question to which we want to address our- 
selves here is the relation between physio- 
logical and verbal indices of disturbance, 
anxiety, or emotionality, rather than sub-
jects’ evaluations of, or reactions to, their
own emotional states. 

In order to arouse emotional disturbance 
we have made use of a new research instru- 
ment developed by Heath (1960). The 
Phrase Association Test (PT) presents the 
subject with a series of phrases dealing with 
a variety of conflictual material. The sub-
ject is instructed to respond to these phrases
with the first phrase or association that
comes to mind (Heath, 1960).

Although an offspring of the Word Association
and Sentence Completion Tests, the PT attempts
to avoid the limitations of these and other unstruc- 
tured tests by its systematic use of replicated 
highly structured stimulus phrases, minimal in- 
structional and structural constraints on the asso- 
ciative response, and an economical, objective, and 
quantitative scoring system for measuring behav- 
ioral indices of defensive behavior (p. 166). 
(Quoted by permission of the Journal Press)

It permits the evaluation of verbal activity
in respect to specific areas of conflict— 
corresponding to the content of the phrases. 
The use of phrases, rather than words, nar- 
rows the latitude of definition and meaning 
of the stimulus and focuses arousal into 
more specific and more easily analyzable 
areas. While restricting and controlling the 
stimulus material, the test, by requiring sub- 
jects to respond in phrases rather than in 
single associations, broadens the kind of be- 
havior elicited and permits wider areas of 
analysis of verbal defensive behavior. Thus, 
Heath has developed a 22-item checklist for 
scoring subjects’ verbal productions in re- 
sponse to the PT. We have used in the 
present investigation a 29-item list (cf. Ap- 
pendix C) divided into five areas of re- 
sponse modes. This scoring system concen- 
trates on cognitive defensive activity and is 
primarily focused on verbal behavior. While 
the analysis of verbal behavior is clearly 
applicable to some of the classical defense 
mechanisms (such as denial and intellectual- 
ization), it fails to deal with others such as 
repression, suppression, and reaction forma- 
tion. However, within the area of verbal 
reactions to threat it has a range of inclu- 
sion which, while narrower than the usual 
classifications, is directly appropriate to the 
material obtained from our subjects and 
may, in fact, have wider applications to 
other kinds of verbal behavior elicited under 
different conditions. 

At the physiological level we have avoided 
a restriction to a single measure of response. 
Rather we have collected, concomitantly to 
the subjects' verbal response to the stimulus 
phrases, data on heart rate, skin tempera-
ture, peripheral blood flow, as well as the
galvanic skin response. 

Two separate studies were conducted. In 
Study I we were concerned with the relation 
between verbal and physiological indices of 
response to the PT. Upon completion of 
this study it was decided to revise the list 
of phrases and to undertake a separate 
replication of the verbal behavioral findings 
of the first investigation. Consequently, 
Study II used an amended and expanded 
PT and no physiological measurements. A 
group Rorschach was also given to these 
subjects. 

Before stating the specific questions to 
which we want to address the investigation 
one reminder is appropriate. We have 
pointed out previously (Mandler, 1959) 
that psychological studies of individual vari- 
ation and of stimulus variation are logically 
disparate efforts. Thus, questions about the 
relation among variables when values on 
these variables are associated with indi- 
vidual subjects are different from questions 
about covariation when values are asso- 
ciated with stimuli or situations. In our dis- 
cussion, therefore, these two approaches 
will be kept distinct. 

While our primary concern in the present 
investigation will be with the description of 
significant variations in subject and stimulus 
differences on several indices of response 
to threat, some of the specific questions to 
be asked can be outlined briefly:


1. The verbal response to threat— 


a. How do subjects differ in degree of 
response to various areas of threat, in mode 
of response, and in the relation between 
these two? Do subjects respond consistently 
from one area of threat to another or is 
there a differential use of response modes 
depending on the stimulus material? 


b. Can stimuli be ordered meaningfully 
according to the degree to which they elicit 
signs of verbal disturbance? Do specific 
types of stimuli tend to elicit specific types 
of response modes?

c. Is the response to threat in the
Phrase Association Test related to the per-
ception of threatening material on the
Rorschach?

2. The physiological response to threat
and its relation to verbal indices— 

a. What is the relation between sub- 
jects’ perception of visceral activity (auto- 
nomic feedback) and their performance on 
the Phrase Association Test? 

b. What is the relation among three 
general indices of anxiety: Physiological 
activity, self-report of anxiety, and verbal 
disturbance? 

c. What is the relation among indi- 
vidual differences in degree of physiological 
activity, sensitivity to particular threat 
areas, and preferential use of specific re- 
sponse modes? 

d. Do stimuli differ significantly in the 
degree to which they elicit physiological 
arousal? How is this function of stimuli 
related to their tendency to elicit differential 
verbal disturbance? 


METHOD 


This investigation consists of two studies. 
In Study I, 32 subjects were presented with 
18 phrases and their verbal and physiolog- 
ical reactions were recorded concomitantly. 
In addition, these subjects were given paper 
and pencil self-rating scales of anxiety and 
visceral perception. In Study II, 28 sub- 
jects were presented with 40 phrases and 
their verbal reactions were recorded. These 
subjects were also given a group Rorschach 
test. 


The Phrase Association Test (PT) 


The general rationale for the PT has been de- 
scribed above; it only remains to describe in detail 
the particular stimuli and scoring systems used in 
our studies. 

In Study I, 18 phrases adapted from Heath 
(1956) were used; they cover four content areas:
Neutral, Aggression, Sex, and Dependency. There 
were 6 neutral phrases and 4 in each of the other 
areas. These phrases are shown in Appendix A. 
It will be noted that the threat phrases deal with 
a variety of conflictual materials in these areas and 
use both humans and animals as main characters. 
The neutral phrases were designed to elicit no 
conflict in themselves; they do permit an evalua-
tion of the subjects’ general level of disorganiza-
tion in our particular situation.

In Study II, a total of 40 phrases were used, 8
in each of the following areas: Neutral, Sex, Ag-
gression, Dependency, and Competition. Only 9 of
the phrases were identical with those used in 
Study I and the new phrases were designed with 
special reference to the student population used in 
our studies. Thus, the addition of the Competition 
category was prompted by the belief that this 
might be a significant area of conflict or threat for 
Harvard college students. The phrases are shown 
in Appendix B. 

In developing a scoring system for the subjects’ 
verbal productions we were guided by two consid- 
erations: first, we wanted to make use of the 
scoring system which Heath had shown to be use- 
ful for the evaluation of conflict areas in a hospital 
population (Heath, 1956); and second, we wanted
to expand it to be most sensitive to nuances of 
verbal behavior and to be theoretically meaningful 
in terms of possible reactions to threat. The final 
checklist (see Appendix C) contained 29 items 
divided into five modes of response: four of these 
modes vary along the dimension of degree of ac- 
ceptance of or involvement with the stimulus mate- 
rial; the fifth is an index of behavioral interfer- 
ence. 

Stimulus Avoidance. The most aloof type of 
response the subject can make to the stimulus 
phrases consists of an attempt to avoid the task 
and stimulus entirely. He can refuse to see it; he 
can try to leave the field. In terms of verbal reac- 
tions he can give no response, merely repeat the 
phrase, discuss the experimental equipment, and so 
forth. 

Recoding or Denial. In a manner somewhat 
similar to the physical avoidance of a stimulus, a
subject can psychologically avoid it by a number 
of means. He can accept the stimulus qua stimulus 
material but then proceed to change its meaning in 
a variety of ways, or he can deny the validity or 
implications of its content. In terms of verbal re-
sponse he can misinterpret the meaning of the
phrase, deny its truth, evade the main theme, and 
so forth. In contrast to the Avoidance category 
the subject does react to the stimulus, but he does 
not explicitly accept its meaning. 

Rationalization, Neutralization, and Intellectual- 
ization. Further along the dimension of involve- 
ment a subject can accept the meaning of a stim- 
ulus while at the same time handling it in an im- 
personal fashion. Verbal responses in this category 
include normative statements which imply that the 
content of the phrase is not unusual, justifications 
of the meaning of the phrase by the elaboration of 
causes and motives, and so forth. 

Personalization (personal involvement). At the
extreme of involvement with the stimulus material 
the subject can respond to the phrase without any 
attempt to alter its meaning and by referring the 
content to his own experiences and value systems. 
In terms of verbal behavior he refers to himself 
in his response, elaborates the content in terms of
value judgments, and so forth. 

Interference. This measure of behavioral inter- 
ference includes indices which point to a general 
breakdown in the subject's ability to handle the
stimulus material. It includes stuttering, laughter, 
long reaction times, and so forth. 

In Study I the subjects' responses were tape re- 
corded, subsequently transcribed, and scored. In 
Study II the responses were directly recorded by 
the experimenter and then scored from the proto- 
cols. In scoring these verbal productions, each 
response received a score equal to the sum of the 
appropriate items checked on the 29-item list. 

There are two possible approaches to the problem
of reliability: one would show a reliability co- 
efficient computed for the total scores given by 
two judges to a number of protocols, the other in-
vestigates the agreement between the two judges
on the actual signs checked for a number of proto- 
cols. We chose the latter, more stringent, criterion 
for our present investigation and computed per- 
centage agreement for two scorers on all protocols 
according to the formula: twice the number of 
agreements divided by the sum of all indices 
checked by the two scorers. The results were: in 
Study I, 77%; in Study II, 75%. This level of 
agreement is consistent with Heath's (1960) fig-
ures of 68% and 77% for two similar reliability
studies. Heath also reports reliability coefficients
on the same data of .94 and .95 for total PT
scores.
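
The percentage agreement formula stated above (twice the number of agreements divided by the sum of all indices checked by the two scorers) can be sketched as follows. The data layout, in which each checked sign is stored as a (response number, checklist item) pair, and the function name are assumptions made for illustration; the sample values are hypothetical and are not taken from the protocols.

```python
def percent_agreement(signs_a, signs_b):
    """Twice the number of agreements divided by all signs checked by both scorers.

    signs_a, signs_b: sets of (response_number, checklist_item) pairs,
    one set per scorer (an assumed representation).
    """
    agreements = len(signs_a & signs_b)          # signs checked by both scorers
    total_checked = len(signs_a) + len(signs_b)  # all signs checked by either
    if total_checked == 0:
        raise ValueError("neither scorer checked any sign")
    return 2 * agreements / total_checked


# Hypothetical example: three signs in common, four checked by each scorer.
scorer_a = {(1, 3), (1, 17), (2, 5), (3, 22)}
scorer_b = {(1, 3), (2, 5), (2, 9), (3, 22)}
print(f"{percent_agreement(scorer_a, scorer_b):.0%}")  # prints 75%
```

Because a disagreement on any single sign lowers the index, this measure is more stringent than a correlation between total scores, which is consistent with the lower values reported for it.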

In summary, this scoring method provided us 
with scores for each subject for each area of 
threat, for each mode of response across all stimuli 
used, and a total disturbance score. 


Physiological Measures 


Physiological recording was accomplished with a 
modified Grass six-channel polygraph described in 
a previous article (Mandler, Mandler, & Uviller, 
1958). Temperature and humidity in the experi- 
mental room were controlled; the mean tempera- 
ture for the 32 sessions in Study I was 72.2°F. 
(SD = 2.1), the mean relative humidity was 40%
(SD = 6). Measures were taken on heart rate 
(with a Grass cardiotachometer), peripheral blood 
flow (by means of a Waters oximeter), finger 
temperature (by means of a Yellow Springs tele- 
thermometer), and palmar galvanic skin response 
(by means of a Yellow Springs dermohmmeter). 
Within each of these channels the following indices 
were obtained for each subject:


Heart Rate: 


1. The mean of the five fastest beats during the
15 seconds after the onset of stimulus was com-
puted. Of the 18 mean values for each subject, the
highest value was corrected for heart rate base
level during the minute preceding the start of
stimulus presentation by the method suggested by
Lacey (1956). The resultant autonomic lability
score (ALS) is the first heart rate measure
(EKG₁).

2. The highest raw value determined as above is
our second heart rate measure (EKG₂).

3. The means of the five fastest beats during the 15
seconds before and the 15 seconds after the onset
of each stimulus were computed. For each subject
the mean value for all 18 stimuli was computed
and the difference between these two values is the
third heart rate measure (EKG₃).
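
The third heart rate measure can be illustrated with a short sketch. It assumes that each 15-second window is available as a list of beat-by-beat rates (beats per minute) and that the difference is taken as the post-onset mean minus the pre-onset mean; the source states only that the difference between the two values was used. The function names and sample values are hypothetical.

```python
from statistics import mean


def mean_five_fastest(rates):
    """Mean of the five highest beat-by-beat heart rates in one window."""
    if len(rates) < 5:
        raise ValueError("need at least five beats in the window")
    return mean(sorted(rates, reverse=True)[:5])


def third_heart_rate_measure(trials):
    """trials: list of (rates_before_onset, rates_after_onset), one per stimulus.

    Returns the mean post-onset value minus the mean pre-onset value
    (the sign convention is an assumption).
    """
    before = mean(mean_five_fastest(b) for b, _ in trials)
    after = mean(mean_five_fastest(a) for _, a in trials)
    return after - before


# Hypothetical data for two stimuli (the study used 18 per subject).
trials = [
    ([60, 62, 64, 66, 68, 70], [70, 72, 74, 76, 78, 60]),
    ([61, 63, 65, 67, 69], [71, 73, 75, 77, 79]),
]
print(third_heart_rate_measure(trials))
```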


Galvanic Skin Response: 

1. An ALS score was computed similar to the 
first heart rate measure, representing the highest 
conductance value for each subject regardless of 
which stimulus elicited it and corrected for base 
conductance prior to stimulus presentation (GSR₁).

2. For each stimulus we computed the difference
in log conductance between the level at stimulus
onset and the highest level obtained after stimulus
onset. For each subject we computed the mean of
these 18 values (GSR₂).
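
A minimal sketch of the second galvanic skin response measure follows. It assumes conductance readings in arbitrary but consistent units and uses base-10 logarithms; the source does not state the base of the logarithm, so that choice, like the function names and sample values, is an assumption.

```python
import math


def log_conductance_change(onset_level, post_onset_levels):
    """Log of the highest post-onset conductance minus log of the onset level."""
    peak = max(post_onset_levels)
    return math.log10(peak) - math.log10(onset_level)


def second_gsr_measure(trials):
    """trials: list of (onset_level, post_onset_levels), one per stimulus.

    Returns the subject's mean change in log conductance across stimuli.
    """
    changes = [log_conductance_change(onset, post) for onset, post in trials]
    return sum(changes) / len(changes)


# Hypothetical readings for two stimuli (the study used 18 per subject).
trials = [
    (10.0, [10.0, 40.0, 100.0]),  # tenfold rise: change of 1.0 log unit
    (5.0, [5.0, 4.0]),            # no rise above onset level: change of 0.0
]
print(second_gsr_measure(trials))
```

Working in log units makes a given proportional rise count equally whatever the subject's resting conductance.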


Peripheral Blood Flow: 

1. For each stimulus presentation we counted the 
number of discriminable changes in direction dur- 
ing the 30-second presentation of the stimulus. For 
each subject the measure is the mean of these 18 
values (BF₁).

2. For each subject we measured the largest in-
crease in blood flow for each stimulus and com-
puted the mean for each subject (BF₂).

3. Similar to BF₂, for each subject we measured the
mean of the largest drops in this index (BF₃).


Finger Temperature: 

1. For each subject we measured the size of the 
largest continuous rise in finger temperature for 
each stimulus. Our first temperature measure is
the mean of these values (T₁).

2. Similar to T₁ we measured the largest drop
in finger temperature (T₂).

3. T₁ or T₂, whichever was larger for that par-
ticular subject (T₃).


For each stimulus phrase we computed the fol- 
lowing physiological measures: 

Heart Rate: 

1. The difference between the means (for all
subjects) of the five fastest heart beats 15 seconds
before and 15 seconds after stimulus onset
(EKG₁ₛ).

2. Similar to EKG₁ₛ except that the five fastest
beats between 16 seconds and 30 seconds after on-
set were used (EKG₂ₛ).


Galvanic Skin Response: 

1. The mean change in conductance as deter-
mined for GSR₂ above, but averaged across sub-
jects (GSR₁ₛ).

Peripheral Blood Flow: 

1. The mean number of changes in direction (as
computed for BF₁) for each stimulus (BF₁ₛ).

2. Similar to BF₂. Mean value of the largest
rise in this index (BF₂ₛ).

3. Similar to BF₃. The mean value of the largest
drop in this index (BF₃ₛ).

Finger Temperature:


1. Similar to T₁. The mean of the largest in-
creases in temperature across subjects (T₁ₛ).

2. Similar to T₂. The mean of the largest de-
creases in temperature across subjects (T₂ₛ).


Rating Scales, Interviews, and Group 
Rorschach 


The subjects in Study I, in addition to the PT
and the physiological measures, were given the
Autonomic Perception Questionnaire (APQ)² de-
scribed elsewhere (Mandler, Mandler, & Uviller,
1958) and the Manifest Anxiety scale (MA) de-
veloped by Taylor (1953). The APQ measures
the subject's general awareness of bodily and vis- 
ceral reactions during periods of stress or un- 
pleasure. Immediately following the presentation 
of the PT these subjects were also interviewed 
with the aid of a standard set of questions con- 
cerning their awareness of bodily and visceral 
processes in the course of the experiment. The 
responses to 10 items of this interview were scored 
on a 0, 1, and 2 scale ranging from no awareness to 
marked awareness of such reactions. The sum of 
these scores represents our Interview scale. Scores 
on this scale ranged from 1 to 14 with a mean of
7.4 (SD = 3.1).

The subjects in Study II were given a group 
Rorschach subsequent to the PT administration. 
They were given standard instructions; the Ror- 
schach cards were presented by means of a slide 
projector and subjects wrote their responses in an 
answer booklet. Each Rorschach response was 
scored for the presence or absence of latent or 
manifest imagery in each of the four conflict or 
threat areas used in Study II (see Appendix D for 
the scoring manual). The reliability of this scoring 
scheme was determined by percentage agreement 
between two scorers as for the PT; agreement was

306. 


Subjects 


The subjects in both studies were Harvard Col- 
lege undergraduates who volunteered for the in- 
vestigation and were paid at the rate of $1.00 per 
hour. They were primarily freshmen and none of 
the subjects had any prior experience with experi- 
ments in the personality field. The mean age of the 


² A copy of the Autonomic Perception Question-
naire and scoring instructions have been deposited
with the American Documentation Institute. Order
Document No. 6764 from ADI Auxiliary Publi-
cations Project, Photoduplication Service, Library
of Congress; Washington 25, D. C., remitting in
advance $1.75 for microfilm or $2.50 for photo- 
copies. Make checks payable to: Chief, Photo- 
duplication Service, Library of Congress. 


subjects in Study I was 18.0, in Study II it was


Procedure 


The subjects in Study I originally signed up for 
a group testing session in which they were given 
the APQ and MA. They were then seen in indi- 
vidual sessions where the PT was administered. 
At the beginning of this session each subject was 
told that the study was concerned with the investi- 
gation of physiological reactions in response to a 
variety of different stimulus materials. After elec- 
trodes had been attached the subject was given an 
adaptation period of no less than 20 minutes. At
the conclusion of that period he was given instruc- 
tions for the PT: 


I am going to show you a phrase or sentence 
projected on the wall in front of you. I want 
you to say the first phrase or sentence that comes 
to mind. Any phrase or sentence will do, but 
say the first phrase that comes to mind as quickly 
as you can. This is to see how quickly you can 
react. Between each presentation there will be a
short interval. Be sure to give the first phrase 
that comes to mind and to speak clearly. 


The subject was also reassured that we were not 
interested in his personal reactions, but rather that 
the data collected would form part of a large re- 
search project. After 2 practice phrases the 18 
phrases were presented by means of a slide projector 
which exposed a phrase every 60 seconds. Following
the subject’s response the stimulus phrase was re- 
moved and the subject faced a blank screen until 
the next phrase appeared. The order of phrases 
was randomized for each subject with the restric- 
tion that the first and last phrase was a neutral 
phrase and that the other phrases were presented 
in such a manner that each consecutive block of 
four phrases contained one phrase each from the 
four areas: Neutral, Aggression, Sex, and Depend- 
ency. In addition a blank slide was presented fol- 
lowing the first, ninth, and seventeenth phrase in 
order to check on the possibility of a conditioned 
physiological response to the projector noise and 
stimulus change. The procedure was interrupted 
only if the subject gave three consecutive one-word 
responses in which case he was prompted to re- 
spond with phrases. Following presentation of the 
PT each subject was given the interview described 
above. 
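
The restricted randomization used in Study I can be sketched as follows. The sketch assumes that the four blocks occupy positions 2 through 17, so that the 16 phrases between the opening and closing neutral phrases form four consecutive blocks of four; this is consistent with 6 neutral and 4 phrases in each threat area, but the exact block alignment is not stated in the text. Phrase labels and function names are hypothetical placeholders.

```python
import random

AREAS = ("Neutral", "Aggression", "Sex", "Dependency")

# Placeholder labels: 6 neutral phrases and 4 in each threat area.
PHRASES = {
    "Neutral": [f"Neutral-{i}" for i in range(1, 7)],
    "Aggression": [f"Aggression-{i}" for i in range(1, 5)],
    "Sex": [f"Sex-{i}" for i in range(1, 5)],
    "Dependency": [f"Dependency-{i}" for i in range(1, 5)],
}


def randomized_order(phrases, rng):
    """Return 18 phrases: neutral first and last, four balanced blocks between."""
    pool = {area: rng.sample(items, len(items)) for area, items in phrases.items()}
    first = pool["Neutral"].pop()
    last = pool["Neutral"].pop()
    order = [first]
    for _ in range(4):
        block = [pool[area].pop() for area in AREAS]  # one phrase per area
        rng.shuffle(block)                            # random order within block
        order.extend(block)
    order.append(last)
    return order


def with_blank_slides(order, blank_after=(1, 9, 17)):
    """Insert a blank slide after the first, ninth, and seventeenth phrase."""
    slides = []
    for position, phrase in enumerate(order, start=1):
        slides.append(phrase)
        if position in blank_after:
            slides.append("BLANK")
    return slides


order = randomized_order(PHRASES, random.Random(0))
slides = with_blank_slides(order)
```

For Study II, with five phrase topics, the same scheme would use blocks of five; the source says only that the restriction on blocks was changed accordingly.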

In Study II the procedure was identical except 
that no physiological measures were taken and no 
interview was given. A different experimenter ad- 
ministered this study. The randomization of 
phrases was similar to that in Study I; however,
with five phrase topics the restriction on blocks 
was changed accordingly. Following the PT ses- 
sion the subjects in Study II signed up for one of 
two group Rorschach sessions which were con- 
ducted ostensibly as part of a different research 
project and were given by a different investigator. 


RESULTS AND DISCUSSION


Individual Differences in Response to 
Threat Areas and in Modes of Response 


We are first interested in the relations 
among the threat areas and the use of the 
various response modes. As far as subjects’ 
responses to the various areas are con- 
cerned, we are asking whether a subject 
who scores relatively high in one area will 
also score relatively high in other areas. The 
relevant correlation matrix for both studies 
is presented in Table 1. It can be seen that 
all correlations are positive, indicating that 
the tendency to score high is general across 
areas. However, while all the correlations 
in Study I reach statistical levels of signifi- 
cance, this is not the case for Study II. 

The most striking finding concerns the 
relatively high correlations between Neutral 
and the other areas. Since Neutral phrases 
as a whole received much lower scores than 
the threat areas (see below), this finding
argues for a general disturbance factor in 
the subjects’ response to the task. Subjects 
who score high on the threat areas also 
show more anxiety to the neutral phrases, 
even though the latter do not generally tend
to arouse a high degree of disturbance.
Thus, it is likely that a subject's over-all
anxiety level in response to the task may be
more important in determining degree of
response disorganization than specific anxi-
ety reactions aroused by some particular
subject matter.


TABLE 1

PRODUCT-MOMENT CORRELATIONS AMONG AREAS FOR BOTH STUDIES

Content area          Aggression   Sex       Dependency   Competition
Neutral:
  Study I (N = 32)    .589***      .481***   .366*
  Study II (N = 28)   .555***      .336      .601***      .355
Aggression:
  Study I                          .521***   .392*
  Study II                         .457      .461**       .311
Sex:
  Study I                                    .561***
  Study II                                   .128         .308
Dependency:
  Study I
  Study II                                                .570

Asterisks denote levels of statistical significance.

The remaining intercorrelations are of 
minor interest since they presumably only 
reflect differences in the phrases used in the 
two studies. There does seem to be less of 
a generality factor in Study II than in Study 
I; fewer of the intercorrelations reach sta- 
tistical significance. Thus, Sex shows no 
significant correlations with any other area, 
while Competition is significantly, and not 
surprisingly, related only to Dependency. 
Thus, in Study I there is less of a differ- 
ential response to various areas than in 
Study II. This greater differentiation among 
areas is borne out by the analysis of stim- 
ulus differences discussed below. 

Another approach to the problem of indi- 
vidual differences in response to these areas 
is to determine the number of subjects who 
"choose" each area as their area of major 
conflict or disturbance. Excluding several 
ties in both studies, i.e., cases where subjects 
received identical scores in more than one 
area, the distribution looks as follows: 


           NEUTRAL   AGGRESSION   SEX   DEPENDENCY   COMPETITION
Study I       0          7         8        10            -
Study II      0          Zé        9         3            2


These data provide one validation index for 
the PT. No subject had his highest score in 
the Neutral area, thus arguing for the effect 
of threat content on the disturbance scores. 
Apart from the discrepancy on the Depend- 
ency dimension the studies distribute areas 
of conflict fairly evenly among subjects, and 
the differences between the two studies can 
be ascribed to differences in the phrases 
used.

A more general analysis of the PT dis- 
turbance score and one which should not be 
affected by differences in phrase content be- 
tween the two studies concerns the relations 
among the five modes of response. The
pertinent intercorrelations are shown in
Table 2.


TABLE 2

PRODUCT-MOMENT CORRELATIONS AMONG RESPONSE MODES FOR BOTH STUDIES

Mode of response       Recoding   Rationalization   Personalization   Interference
Avoidance:
  Study I (N = 32)       .232        .112              .215              .442**
  Study II (N = 28)     -.030        .091              .154              .419*
Recoding:
  Study I                           -.437**            .177              .004
  Study II                          -.428*            -.258             -.178
Rationalization:
  Study I                                             -.027              .302
  Study II                                            -.044              .182
Personalization:
  Study I                                                                .285
  Study II                                                               .189

Asterisks denote levels of statistical significance.


The agreement between the two studies is 
excellent. Of 10 correlations the same 2 are 
statistically significant in both studies and 
surprisingly similar in magnitude. This re- 
sult constitutes a partial reliability and 
validity check of the scoring system: two 
groups of subjects presented with different 
stimuli show similar relationships in the 
way they respond to or handle threat.

The first significant correlation to note is
that between Avoidance and Interference. 
While Interference indicates some break- 
down in defensive and coping activity on 
the part of the subject, Avoidance is prob- 
ably the most pervasive and primitive type 
of defensive maneuver. It seems reasonable 
that subjects who show extreme disorgan- 
ization, as measured by the Interference 
category, will also tend to be most likely to 
avoid the task altogether and to try to es- 
cape from the requirements of the situation. 
A subject's difficulty in handling the task
can be expressed in both of these two 
modes. 

The other important correlation is the 
negative one between Recoding and Ration-
alization. In our rationale for the scoring
system we suggested that while Recoding
implies the denial of meaning of the phrase,
Rationalization as a mode of response ac-
cepts the meaning of the phrase but intel-
lectualizes it. It appears from the present
data that these two modes of response are 
actually alternative modes of handling mate- 
rial such as that used in these studies. A 
subject who uses one of them does not tend 
to use the other. We may be encountering 
here a personality difference related to such 
concepts as leveling and sharpening or vigi-
lance and defense.

It should be noted that we had expected 
Personalization to be related to Recoding in 
a somewhat similar manner as Rationaliza- 
tion. Our data do not bear this out and 
while Personalization seems to be a mode 
of response independent of Recoding and 
Rationalization it is not alternative to either 
of them. Subjects may or may not use it 
and the other two modes independently. 

The next question we may ask is about 
the differential use of response modes. Do 
subjects tend to concentrate on one particu- 
lar kind of mode or defense or do they use 
these response types indiscriminately? 
First, we look at the distribution of the vari- 
ous response modes for the population at 
large to examine the proportion of disorgan- 
ization signs which fall into each category: 


           AVOIDANCE   RECODING   RATIONALIZATION   PERSONALIZATION   INTERFERENCE
Study I       12%         27%           24%                21%              16%
Study II      11%         23%           22%                27%              18%


The agreement between the two studies is 
quite remarkable and what discrepancy 
there exists may be due to the elimination 
of phrases with animal content in Study II, 
phrases which may have decreased the num- 
ber of Personalization responses. If we use 
these figures as the average or expected use 
of these modes of response, we may now 
examine the concentrations of signs within 
particular modes for individual subjects. 
Looking at the percentage of disturbance 
signs falling into each subject’s most fre- 
quently used mode, we find in Study I a 
mean of 37% (range 24-76%), in Study II 
a mean of 40% (range 25-58%). Thus the 
mean concentration of signs, as well as the 
range, is quite markedly above the concen- 
tration for the population, i.e., the average 
subject. This indicates that different subjects
concentrate their responses in different
response modes. However, no subject uses 
a single mode exclusively, the maximum 
concentrations being 76% and 58%, respec- 
tively. As a further step we established an 
arbitrary index of major concentration of 
response signs, i.e., a concentration of 40% 
or more (the population mean of concentra- 
tion) of all signs into a single response 
mode. Using this criterion we find that 7 
subjects in Study I and 12 subjects in 
Study II have a major response mode. In 
Study I, 4 of these subjects concentrated on 
Recoding and 3 on Rationalization; in
Study II, 4 subjects concentrated on Re- 
coding, 3 on Rationalization, and 5 on Per- 
sonalization. Thus a sizable number of 
subjects (32% in both studies combined)
concentrate on a single mode of response. 
It should be noted that none of these sub- 
jects shows any concentration in either 
Avoidance or Interference. This substanti- 
ates our initial assumption that these two 
categories should be differentiated from the 
other three which can more properly be 
called defensive modes. Avoidance and In- 
terference are indices of response to the 
task at large. 
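
The concentration index described above can be illustrated with a short Python sketch. The function name, the dictionary layout, and the sign counts are hypothetical and are not taken from the scoring manual; only the 40% criterion comes from the text.

```python
def major_response_mode(sign_counts, criterion=0.40):
    """Return (mode, share) for a subject's most frequently used mode.

    sign_counts maps each response mode to that subject's number of signs.
    mode is None when the largest share falls below the criterion.
    """
    total = sum(sign_counts.values())
    if total == 0:
        return None, 0.0
    # Most frequently used mode and its share of all signs.
    mode, count = max(sign_counts.items(), key=lambda item: item[1])
    share = count / total
    return (mode if share >= criterion else None), share


# Hypothetical subject (illustrative counts only, not study data).
subject = {
    "Avoidance": 2,
    "Recoding": 9,
    "Rationalization": 4,
    "Personalization": 3,
    "Interference": 2,
}
mode, share = major_response_mode(subject)
print(mode, round(share, 2))
```

A subject whose signs are spread evenly over the five modes would return no major mode under this criterion.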

The final question we may ask about in- 
dividual differences on the PT concerns the 
relation between threat areas and modes of 
response. An examination of the data on 
subjects’ concentration of response mode 
showed no apparent tendency for subjects 
to shift type of response according to the 
area being tapped. Another approach to the 
same question used the correlations between 
subjects’ scores on areas and their scores on 
the response mode categories. It was found 
that most of these correlations were signifi- 
cant because of the large amount of vari- 
ance being contributed by total PT scores. 
In order to control for this factor, partial 
correlations were computed between areas 
and modes holding total PT score constant. 
The resulting matrices for the two studies 
(of 20 partial correlations for Study I and 
25 for Study II) resulted in only a few 
significant relationships. In Study I the 
partial r between Aggression and Avoidance
was .463 (p < .01), while the same correlation
for Study II was .211. Similarly the partial
correlation between Neutral and Interference
was -.420 (p < .02), for Study II -.295
(p < .10); between Neutral and Personalization
it was .435 (p < .02), for Study II -.060.
Finally the partial correlation between Sex and Interference in
Study II was .519 (p < .01), in Study I 
.160. Considering only those relationships 
which were confirmed in direction by the 
other study we can say that: (a) Subjects 
who are high on the Interference scale show 
little disturbance on the Neutral phrases. 
(b) Subjects who are high on the Avoid- 
ance mode also show more disturbance in 
the Aggression area. (c) Subjects who are
high on the Interference scale are high in 
the Sex area. 
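
The partialing step used above, in which total PT score is held constant, can be sketched with the standard first-order partial correlation formula. This is a minimal Python illustration; the function name and the three zero-order coefficients are hypothetical and are not values from either study.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical zero-order correlations (illustrative only):
r_area_mode = 0.60    # area score with response-mode score
r_area_total = 0.70   # area score with total PT score
r_mode_total = 0.65   # response-mode score with total PT score

print(round(partial_corr(r_area_mode, r_area_total, r_mode_total), 3))
```

When both variables correlate substantially with the total score, as in this made-up case, the partial coefficient is much smaller than the zero-order one, which is the pattern the text describes.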

The last two findings suggest that sub- 
jects who react with more anxiety to the 
areas of Sex and Aggression also show 
more use of the response modes which indi- 
cate generalized disturbance. We shall have 
occasion to return to these findings in con- 
nection with some of the physiological 
indices. 


Stimulus Differences in Verbal Disturbance 


We may now ask how well the 18 stimuli 
are differentiated in terms of degree of dis- 
turbance and type of response which they 
evoke. 

Table 3 shows the mean number of signs 
associated with the various areas, and 
Table 4 presents the relevant analyses of 
variance. As a further validation of the PT,
the Neutral phrases elicit fewer signs than 
do the threat areas. The analyses of vari- 
ance show a significant source of variance 
to be associated with areas in both studies. 
An analysis of the area means with Tukey's 
gap test permits the further statement that 
in Study I the Neutral phrases are signifi- 
cantly different from the threat phrases, but 
that no statistically significant differences 
exist among the threat groups. In Study II
a finer distinction can be made since the 
Neutral phrases show a significantly low 
mean phrase score, followed by the Compe- 
tition and Dependency areas, which are in 
turn significantly lower than the Sex and 
Aggression areas. This substantiates the
point made earlier that in Study II the various
areas were better differentiated than in
Study I.

TABLE 3

MEAN NUMBER OF SIGNS ELICITED BY THE
PHRASE AREAS IN BOTH STUDIES

Study   Neutral   Aggression   Sex   Dependency   Competition
I         33          58        57       64
II        22          38        37       30           30

The question might be asked why the 
Neutral phrases show any disturbance signs 
at all. It should be remembered that the 
phrases were presented in a quasirandom 
order and that we have suggested general- 
ized anxiety reactions to the task as a whole. 
Furthermore, the signs described in our 
scoring manual cover a very broad range of 
possible responses with a great likelihood 
that even a truly "neutral" phrase will re- 
ceive scores larger than zero. This possi- 
bility has permitted us to estimate subjects’ 
habitual ways of responding to verbal 
stimuli.

The mean differences among areas shown 
in Table 3 should again be ascribed to the 
different kinds of phrases used in the two 
studies. It is interesting to note though that
the mean scores on Study II are generally
lower than in Study I. Two possible explanations
can be advanced for this phenomenon.
The experimental situation in
Study I, including elaborate physiological 
measurement equipment, was probably more 
anxiety arousing than the situation in Study 
II. Furthermore, the experimenter in
Study II was himself an undergraduate, 
while the experimenter in Study I was much 
older than the subjects and probably more 
of an authority figure. 

One further set of analyses was con- 
ducted on other subdivisions of the phrases. 
These comparisons concerned phrases which 
made reference to father vs. mother, aggression
from others vs. aggression toward
others, and homosexual vs. heterosexual 
content. The only consistent finding was 
that homosexual phrases received signifi- 
cantly higher scores than heterosexual 
phrases in Study II with a similar but nonsignificant
difference in Study I (combined
p < .05 by Stouffer's test).
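
As an illustration of how two independent results can be combined by Stouffer's method, the following Python sketch converts one-tailed p values to normal deviates, sums them, and divides by the square root of the number of tests. The function name and the two p values are hypothetical; the text does not report the individual probabilities, and the unweighted form of the test is an assumption.

```python
import math
from statistics import NormalDist


def stouffer_combined(p_values):
    """Combine one-tailed p values with Stouffer's unweighted Z method.

    Returns (combined z, combined one-tailed p).
    """
    std_normal = NormalDist()
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    z_combined = sum(z_scores) / math.sqrt(len(z_scores))
    return z_combined, 1.0 - std_normal.cdf(z_combined)


# Hypothetical p values for two studies (illustrative only).
z, p = stouffer_combined([0.03, 0.20])
print(round(z, 3), round(p, 4))
```

A significant result in one study and a same-direction but nonsignificant result in the other can thus yield a significant combined probability.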

We have previously discussed the rela- 
tions among different response modes as in- 
dividual difference measures. The question 
still remains whether phrases tend to show 
any covariance in their power to elicit par- 
ticular response modes. We have already 
seen in the section on individual differences 
that the modes differ in the percentage of 
signs assigned to them. This relationship is
now substantiated in the significant source
of variance associated with modes in
Table 4.

TABLE 4
ANALYSES OF VARIANCE ON STIMULUS DIFFERENCES AND RESPONSE MODES

                         Study I                     Study II
Source             df    MS       F            df    MS       F
Total              19                          199
Between                  35.6                  39    11.9
  Areas (A)        3     151.0    22.54***     4     62.2     10.20***
  Within
Within                                         160   15.6
  Modes (M)        4     156.0    6.32***      4     155.0    10.76***
  M X A            12    21.1     <1.00        16    9.6      <1.00
  Error            48    24.7                  120   14.4

*** p < .0[illegible]

However, when the various modes
are correlated, using stimuli as instances, 
only one significant and consistent finding 
emerges: Recoding and Interference show 
a correlation of .335 (p < .05) for Study
II and of .411 (p < .10) for Study I.
Phrases which elicit a high degree of Inter- 
ference also elicit a high number of Recod- 
ing responses. Again it seems quite reason- 
able ex post facto that a phrase which elicits 
the most severe behavioral disturbance 
should also tend to elicit a quite primitive 
psychological defense, one which requires 
the least involvement on the part of the sub- 
ject. This relationship was further explored 
by an investigation of the mean verbal re- 
action times and mean lengths of responses 
associated with phrases. It seemed likely 
that phrases which are most highly threatening,
i.e., evoke Interference and Recoding,
would also show long reaction times and 
short responses. In fact, the correlations 
between Recoding and Interference and 
mean reaction time for phrases were all 
positive and statistically significant:


REACTION TIME

                Study I           Study II
RECODING        .574 (p < .02)    .341 (p < .05)
INTERFERENCE    .688 (p < .01)    .527 (p < .01)


The correlations with Interference are 
partly due to the inclusion of reaction time 
measures in the Interference scores. 

The relationships with response length 
(in number of words) are less impressive, 
though generally in the predicted direction: 


RESPONSE LENGTH

                Study I            Study II
RECODING        -.554 (p < .02)    -.158
INTERFERENCE     .065              -.210


In neither study do we find any signifi- 
cant correlations between the other three 
response modes and reaction time and re- 
sponse length. 

These findings indicate a cluster of char- 
acteristics of stimulus phrases consisting of 
Interference and Recoding responses, long 
reaction times and, possibly, short re- 
sponses. 

The final question about the stimuli, 
whether different types of phrases tend to
elicit different defenses, is succinctly answered
by Table 4 which shows that in both
studies the interaction between defense 
modes and phrase areas is not an important 
source of variance. Thus, any differential 
interaction between areas and response 
modes is an individual difference phenom- 
enon and is not associated with stimuli. 


Phrase Association Test and Group Rorschach


As a further extension of the implications
of PT scores the subjects in Study II were 
given a group Rorschach test. The major 
interest is in a comparison between the 
structured verbal PT and the more un- 
structured perceptual Rorschach. Our pri- 
mary attention was first to Rorschach re- 
sponses relevant to the four threat areas 
tapped by the PT, and second to the distinc- 
tion between manifest and latent imagery 
on the Rorschach and its relation to the 
Phrase Association Test. Table 5 shows the 
mean number of responses per subject and 
the percentage of total responses falling into 
the latent and manifest categories of the 
four conflict areas. While a sizable num- 
ber of the total Rorschach responses 
(36.8%) fell into the four areas, the majority
are associated with the Sex category.
Since the number of responses associated
with Dependency and Competition was so
small, they were dropped from further analyses.

TABLE 5

MEAN NUMBER OF RORSCHACH RESPONSES AND
PERCENTAGE OF TOTAL NUMBER OF
RESPONSES FOR THE FOUR
THREAT AREAS

Content        Manifest responses   Latent responses   Total
area           N (%)                N (%)              percentage

Sex            1.75 (4.3)           9.14 (22.7)        27.0
Aggression     1.44 (3.1)           1.86 (4.2)          7.3
Dependency     0.11 (0.2)           0.21 (0.8)          1.0
Competition    0.18 (0.5)           0.43 (1.0)          1.5

The large number and percentage of
the Sex responses may well be a scorer
characteristic since the wealth of literature 
on sexual imagery may make it difficult for 
the personality psychologist in mid-twentieth 
century not to “see” many sexual responses 
on the Rorschach. 

Our first question concerned the relation 
between the total PT score and some global 
Rorschach measures. However, total PT 
score is not related either to total number 
of Rorschach responses or total number of 
images (manifest and latent) in the threat 
areas (r = —.116 and .016, respectively). 
The latter relationship remains essentially 
unchanged when total number of Rorschach 
responses is held constant. 

The major hypothesis we investigated 
was that an indication of a high degree of 
anxiety in a particular area (derived from 
the PT) would be negatively correlated 
with the appearance of manifest imagery in 
that area on the Rorschach, but positively 
correlated with latent imagery. The reason- 
ing was that if obvious sexual or aggressive 
content is anxiety arousing in the context 
of the PT then the subject is less likely to 
use such imagery or responses on the Ror- 
schach. On the other hand, preoccupation 
with these areas would be expressed in 
latent imagery. Taking the Sex and Ag- 
gression areas separately the data are equi- 
vocal. For Aggression the correlation be- 
tween the PT Aggression scores and mani- 
fest aggressive imagery is —.153, with latent 
imagery it is .259; for Sex the two correlations
are -.387 (p < .05) and -.266, respectively.
The relative relation between
the two correlations is as predicted and be- 
comes more convincing when the two areas 
are combined. In that case total manifest 
sexual and aggressive imagery correlates 
-.385 (p < .05) with combined Sex and
Aggression PT scores and total latent imagery
correlates .059 with the combined PT
scores. When the total PT scores rather
than just the Sex and Aggression scores are
used the correlations are -.449 (p < .02)
and .105, respectively. Thus there is ground 
for accepting the hypothesis that anxiety as 
expressed in the PT is negatively related to 
the appearance of manifest threatening 
imagery on the Rorschach; however, it is
unrelated to the appearance of latent imagery.


PHYSIOLOGICAL MEASURES 


Autonomic Feedback 


One of the major interests in the present 
study concerns the relation of physiological 
indices of disturbance or anxiety to verbal, 
behavioral, and self-rating indices. In two 
previous studies (Mandler & Kremen, 1958; 
Mandler et al., 1958) both the Autonomic 
Perception Questionnaire (APQ) and a 
postexperimental interview were found to 
be positively related to physiological activity 
during an intellectual stress situation. Thus 
Mandler and Kremen (1958) found that 
APQ and total autonomic activity correlated 
.224, Interview and autonomic activity correlated
.259, while APQ and the postexperimental
interview correlated .391.
When a combined APQ and Interview 
measure was used, the correlation with auto- 
nomic activity was .304. Thus there was a 
consistent, though low, relationship between 
subjects’ report of physiological activity and 
the actual level of that activity. 

The first relevant, and surprising, finding 
in the present study was the absence of a 
significant relationship between APQ and 
Interview (r = —.066). Even more unex- 
pected was a negative correlation of —.228 
between the APQ and a summary measure 
of physiological activity (see below for a 
detailed discussion of that measure). The 
Interview score based on the subjects’ re- 
port of physiological activity obtained im- 
mediately after the experimental session 
was positively related to the sum physiolog- 
ical measure (r = .317, p < .05). Thus, 
the major difference in the two studies lies 
in the failure of the APQ scores to relate 
either to the Interview or to the physiolog- 
ical activity during the experiment. We 
shall return to an investigation of this dis- 
crepancy after a brief review of the previ- 
ous findings obtained with the self-reporting 
scales. 

In the Mandler and Kremen study we 
found that reported perception of visceral 
activity (APQ) was, as expected, negatively 
correlated with the subjects’ intellective
functioning on a Vocabulary test (r = 
—.270). However, Vocabulary scores were 
unrelated to actual physiological activity. 
Thus perception of autonomic activity and 
actual autonomic activity seemed to act 
somewhat independently in their relation to 
intellective functioning. This finding led to 
the suggestion that rather than mere activ- 
ity and its report we might profitably look 
at a measure of subjects’ hypo- or hypersensitivity
to internal bodily events. A measure
of Estimation was obtained which
ranked subjects on a discrepancy score between
their reported autonomic activity and
actual activity. A high score on this scale 
indicates overestimation, i.e., the subject
reports more visceral activity than is indi- 
cated on his actual record—he might be 
said to be preoccupied with internal visceral 
events. Estimation and Vocabulary scores 
were, in fact, negatively correlated (r = 
—.304), and when actual autonomic activity 
is held constant by means of the partial cor- 
relation technique, this relation remains un- 
changed (r = —.317, p < .02). Thus, re- 
gardless of actual level of physiological 
activation there is a relation between tend- 
ency to overestimate such activity and low 
intellectual efficiency. 

In contrast to the previous study actual 
physiological activity was related to verbal 
performance (total PT score) in the present 
investigation. As expected, the summary 
measure of physiological activity and the
PT score were positively related to each 
other (r = .375, p < .05). However, there 
is no significant relationship between either 
the APQ or the Interview and total PT
scores (r = -.242 and -.082, respectively).
It will be recalled that the APQ and the
sum physiological measure are unrelated in 
the present study, but that the Interview is 
positively related to visceral activity. There- 
fore, the latter was used to derive an Esti- 
mation scale for this investigation. For each 
subject we computed a difference score be- 
tween his standard score on the Interview 
scale and his standard score on the sum 
physiological measure; thus a high score on
this scale indicates overestimation, a low 
score underestimation of actual visceral activity.
When this measure is related to PT
performance the resultant r = —.380, i.e., 
subjects who overestimate tend to show less 
disturbance on the phrases than subjects 
who underestimate. However, when autonomic
activity is held constant this correlation
drops to a nonsignificant -.217; when
the Interview score is partialed out the correlation
rises to -.463 (p < .01).³ The
Estimation-PT score relationship therefore 
seems to be relatively independent of Interview
scores, i.e., independent of cognitive
report. 
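
The construction of the Estimation scale for this study (Interview standard score minus physiological standard score) can be sketched as follows. This is an illustrative Python example only: the function names and the raw scores are hypothetical, and the use of the sample standard deviation in forming standard scores is an assumption, since the text does not specify it.

```python
from statistics import mean, stdev


def standard_scores(values):
    """Convert raw scores to standard (z) scores; sample SD is assumed."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def estimation_scores(interview, physiological):
    """Difference between Interview and physiological standard scores.

    Positive values indicate overestimation of visceral activity,
    negative values indicate underestimation.
    """
    z_interview = standard_scores(interview)
    z_physio = standard_scores(physiological)
    return [zi - zp for zi, zp in zip(z_interview, z_physio)]


# Hypothetical raw scores for four subjects (illustrative only).
interview = [12, 7, 9, 4]
physiological = [3.1, 4.0, 2.2, 4.7]
print([round(e, 2) for e in estimation_scores(interview, physiological)])
```

Because both components are expressed in standard-score units, the difference is unaffected by the original scales of the two measures.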

We can now contrast the discrepancies 
between the present investigation on an af- 
fective task and the Mandler and Kremen 
study using an intellective task. In the intel- 
lective task, Estimation is positively related 
to interference regardless of autonomic 
discharge; in the affective task, Estimation 
is negatively related to disturbance regard- 
less of level of cognitive awareness. Thus, 
subjects who are high on cognitive preoccu- 
pation with visceral events do poorly on an 
intellective task independently of their 
actual autonomic activity; the greater the 
cognitive preoccupation the more likely it is 
that a subject will be rated as overestimat- 
ing and the more poorly he will do on the 
task. On the other hand, subjects who show 
a high degree of visceral activity will more 
likely be rated as underestimators, and also 
show disturbance on an affective task. 

This differential importance of cognitive 
and visceral activity in the two kinds of 
tasks can be further illuminated by consid- 
ering the relation between the two report 
scales. While APQ and Interview were 
positively related in the intellective situa- 
tion, they are unrelated in the affective task. 
The kinds of things a subject says about his 
habitual awareness of visceral events appear
(for our college population) to be
more directly related to intellective than to
affective situations. This is further borne
out by the lack of relation between APQ
and the physiological measure. When subjects
report habitual visceral activity they
do not seem to use situations such as the
Phrase Association Test as a reference
point. When subjects are specifically questioned
about the situation, however, as in
the postexperimental Interview, the relations
found for intellective tasks reappear.
What is more important, however, is the
apparent differential relevance of cognitive
and visceral factors in the two situations.

3 We might note that even though the APQ and
physiological activity are unrelated, an Estimation
scale based on the APQ as a measure of cognitive
awareness produces results highly similar to those
obtained with the interview. The relevant correlations
are: Estimation vs. PT score -.394; with
physiological activity held constant, r = -.174, with
APQ held constant r = -.339.

The other self-report scale, the Manifest 
Anxiety scale (MA), yielded relations to 
the report measures comparable with the 
Mandler and Kremen study. The correla- 
tion between APQ and MA is .262 (.267 in 
the previous study), between MA and Inter- 
view .298 (as against .199). MA and Esti- 
mation are positively correlated (r = .145) 
and this relation is statistically significant 
when actual visceral activity is held con- 
stant (r = .304, p < .05). The same find- 
ing was obtained in the Mandler and 
Kremen study (the two correlations were 
.284 and .325, respectively) and supports
the notion that "high anxiety-scale scores 
are related to the tendency to overestimate 
visceral discharge." We shall have occasion 
to refer to the relations among the Manifest 
Anxiety scale, the Phrase Association Test, 
and physiological measures below. 


General Physiological Activity and Verbal 
Anxiety Measures 


The summary measure of physiological 
activity referred to in the last section consisted,
for each subject, of the sum of the
standard scores on 9 of the 11 physiological 
measures described in the Method section. 
The two measures which were not included 
were the two GSR measures. The rationale 
for this exclusion was based on the follow- 
ing finding. 

In examining the correlation matrix for 
the 11 physiological measures (Appendix 
E) we noted that of the 55 correlation co- 
efficients 19 were negative. Twelve of these 
19 negative correlations were contributed 
by the two GSR measures. It seems reason- 
able to conclude from this finding that in 
computing a general measure of physiolog- 
ical activity these two measures should be 
excluded since they apparently indicate an 
aspect of activity which is different from 
the general visceral activity—mainly cardio- 
vascular—derived from the other measures. 
In Table 6 we have presented the correla- 
tions between our three self-report meas- 
ures, the total PT score, and four different 
indices of general physiological activity. 
The first measure is the one just discussed, 
the second is the sum of all 11 channels— 
including the GSR measures. The third and 
fourth measures are based on an argument 
by Lacey and Lacey (1958) and derived 
from the notion of response specificity ; this 
type of index uses the highest standard 
score for each subject in whatever channel 
it is found. Thus, for one subject the highest
standard score may be on a heart rate
measure, for another it may be on periph- 
eral blood flow, and so forth. The third 
measure excludes the two GSR channels 
from this analysis, the fourth measure includes
them.
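
The two kinds of index just described, a sum of standard scores across channels and the highest standard score in any channel, can be sketched in Python as follows. The channel names, the raw scores, and the function names are hypothetical, and standardizing with the sample standard deviation is an assumption; only the logic of summing or taking the maximum, with or without the GSR channels, follows the text.

```python
from statistics import mean, stdev


def standard_scores(values):
    """Convert raw scores to standard (z) scores; sample SD is assumed."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def activity_indices(channels, exclude=()):
    """Return (sum of z scores, highest z score) for each subject.

    channels maps a channel name to a list of raw scores, one per subject.
    Channels named in exclude are left out of both indices.
    """
    kept = [name for name in channels if name not in exclude]
    z = {name: standard_scores(channels[name]) for name in kept}
    n_subjects = len(z[kept[0]])
    sums = [sum(z[name][i] for name in kept) for i in range(n_subjects)]
    highest = [max(z[name][i] for name in kept) for i in range(n_subjects)]
    return sums, highest


# Hypothetical data: three channels, three subjects (illustrative only).
channels = {
    "heart_rate": [1, 2, 3],
    "blood_flow": [3, 2, 1],
    "gsr_1": [1, 2, 3],
}
sum_without_gsr, highest_without_gsr = activity_indices(channels, exclude=("gsr_1",))
sum_with_gsr, highest_with_gsr = activity_indices(channels)
print(sum_without_gsr, highest_without_gsr)
print(sum_with_gsr, highest_with_gsr)
```

The highest-score index ignores which channel the peak occurs in, which is the sense in which it reflects response specificity.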


TABLE 6

PRODUCT-MOMENT CORRELATIONS AMONG VARIOUS MEASURES OF PHYSIOLOGICAL ACTIVITY AND
VERBAL MEASURES OF ANXIETY

Verbal       Sum physiological   Sum physiological   Highest specific   Highest specific
measure      without GSR         with GSR            without GSR        with GSR
PT score      .325 (.05)          .182                .210              -.027
APQ          -.228               -.147                .007               .159
MA            .055                .236                .416 (.01)         .599 (.01)
Interview     .317 (.10)          .304 (.10)          .343 (.05)         .157

Several interesting results appear in Table 6.
We have already had occasion to discuss
the relation among PT scores, APQ
scores, Interview scores, and physiological 
activity. As indicated above, the PT score 
is most efficiently predicted from a summary 
measure which excludes the GSR variables, 
while the APQ is negatively related to the 
same measure. The Interview shows a gen- 
eral positive relation with all of the meas- 
ures used, though again the highest correla- 
tions are obtained when the GSR is ex- 
cluded. The most striking discrepancy is 
found with the MA scores which fail to 
show any relation with the first summary 
measure, but very high correlations with the 
two specificity measures. 

We would offer the following interpreta- 
tion for these findings. In light of the lack 
of consistency between the GSR and the 
other physiological indices it might be use- 
ful to distinguish between two processes: 
emotionality and activation. Rather than 
forming a conjunction of these two proc- 
esses (cf. Duffy, 1957) we would argue 
that while emotionality usually implies acti- 
vation, activation need not necessarily be 
accompanied by emotionality. Considering 
the GSR as a measure of activation (cf. 
Woodworth & Schlosberg, 1954, pp. 144-159)
and the other measures as an index of
emotionality, these findings may be seen to 
be fairly consistent. As far as the specificity 
measures are concerned, we would argue 
that they are more in the activation than in 
the emotionality area; they measure a sub- 
ject’s highest level of activity and may thus 
be an index of highest activation of the sub- 
ject, wherever it occurs. 

In line with these notions, we would 
argue that PT scores are indices of emo- 
tionality rather than activation; they show 
subjects' disturbance in the presence of 
threatening material. Thus, PT scores are 
more directly related to the first summary 
measure than the others. The Interview 
scores are specific reports by the subjects of 
relevant body-perceptions, and the positive 
correlations provide an index of the reli- 
ability of such reports. We have already 
seen that these reports on the APQ, however,
are probably not related by the subjects
to the test situation used in this study.
As a scale of habitual anxiety reactions the
APQ is probably more closely related to the
MA. While in the present study the correlation
between APQ and MA is only a nonsignificant
.262, it has ranged from .27 to
.52 in our previous studies. It might be
noted here that the MA is correlated .596
and .361 with the two GSR measures indi- 
vidually, while the “emotional” PT scores 
are correlated —.342 and —.504 with the 
GSR measures. Thus, considering the GSR 
and the specificity measures as more directly 
related to activation, it seems reasonable to 
conclude that the MA is, in this situation, a 
quite good measure of activation. This find- 
ing is consistent with the general theoretical 
basis underlying the use of this scale as a 
measure of individual differences in drive. 
Theoretically we would argue that the anxi- 
ety dimension can be divided into at least 
two components, one of emotionality and 
one of activation or drive. Whenever these 
two dimensions are elicited in a subject he 
is likely to show high “anxiety” scores on 
the MA as well as on other measures, but 
measures of emotionality, such as the PT,
need not be, and in this case are not, correlated
with MA (r = .125). On the other
hand, individual differences on the MA are
likely to differentiate subjects on a drive or 
activation dimension, if not on emotionality. 


Relations among Individual Differences in 
Physiological Activity, Areas of Threat, and 
Response Modes 


We have already noted that high PT 
scores are associated with high over-all 
physiological activity. It remains to examine 
the relations among individual differences 
in the threat areas, response modes, and
physiological activity. As before, the meas- 
ure of physiological activity will be the 
summary measure without the GSR measures.
The correlations between this measure
and subjects' scores in the four areas follow:


NEUTRAL       .437
AGGRESSION    .396 (p < .05)
SEX           .409 (p < .02)
DEPENDENCY    .232

Thus, variations in level of verbal disturbance
in response to Neutral and Dependency
phrases are not associated with variations in 
visceral arousal, while both Sex and Ag- 
gression are significantly associated with it. 
Two of our indices of anxiety—verbal dis- 
turbance score and physiological arousal— 
show parallel individual differences for the 
two classic areas of threat, but not for the
Dependency area.

When we examine the relation between
these areas and individual channels of phys- 
iological response, two physiological indices 
are of particular interest, GSR and tem- 
perature. 

Consistent with our previous discussion, 
PT scores on all four areas are negatively 
correlated with both GSR scores. The rele- 
vant correlations range from —.303 between 
Neutral and GSR, to -.425 between Sex
and GSR,. 

On the basis of some evidence in the lit- 
erature (e.g., Mittleman & Wolff, 1943) we 
hypothesized that subjects who show high 
degrees of anxiety in the area of Aggression 
would tend to show decreases in finger tem- 
perature, while subjects who were con- 
cerned with sexual problems would show 
increases in temperature. It will be recalled
that one of our temperature measures
is an index of mean temperature rise, while
another is an index of mean temperature
decrease. The relation between these
two measures and scores on the Aggression
and Sex areas is shown below:


              T (rise)          T (decrease)
AGGRESSION    .272              .401 (p < .02)
SEX           .445 (p < .02)    .148


As predicted, we find significant positive relations
between Aggression and the temperature
decrease index and between Sex and the
temperature rise index. Thus the temperature
phenomenon usually associated with sexual 
arousal is generally true for subjects with 
conflicts in the sexual area; that associated 
with aggression describes the temperature
behavior of subjects with aggressive conflicts
in our task.

In pursuing these relations further we
noted that these two temperature indices
are also positively associated with two of
the response modes. Avoidance and temperature
decreases are correlated .497, while
Interference and temperature rises correlate
.374. In the case of temperature decreases
we seem to encounter a triad with Avoidance
and Aggression, having previously
found a correlation between the last two of
.463. In exploring some possible causal relations
among these three variables we resorted
to partial correlational analysis.
When Aggression is partialed out, the cor- 
relation between temperature and Avoid- 
ance drops only slightly to .326; however, 
when Avoidance is partialed out the relation 
between temperature and Aggression dis- 
appears (r = .064). Thus, it is likely that 
avoidance responding leads both to anxiety 
over aggression and to the drop in tempera- 
ture. Possibly the avoidance responses 
which represent a refusal to do the task may 
be viewed as aggressive in nature and this 
aggression toward the experimenter leads 
to increased anxiety over the aggressive 
stimuli and the resultant drop in finger tem- 
perature. In the case of temperature in- 
creases an exploration of the triad with Sex 
and Interference suggests, though less 
strongly, that preoccupation with sexual 
problems leads to interference and to tem- 
perature rises. The partial correlation be- 
tween temperature and Interference drops 
to .101 with Sex held constant, while the
partial correlation between temperature and 
Sex also drops to .277 with Interference 
held constant, These analyses suggest that 
while Aggression scores may be situationally 
determined, the Sex scores are more likely 
to be more pervasive personality character- 


istics. К 
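The partialing step used in the preceding paragraph can be sketched as follows. This is a minimal illustration of the standard first-order partial correlation formula; the function name and the input values are hypothetical and are not the study's data.

```python
import math


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation between x and y with z held constant.

    r_xy, r_xz, r_yz are the three zero-order correlations of the triad.
    """
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("undefined when x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denom


if __name__ == "__main__":
    # Hypothetical triad (illustrative values only).
    r_temp_mode = 0.50   # temperature index vs. response mode
    r_temp_area = 0.40   # temperature index vs. conflict area
    r_mode_area = 0.45   # response mode vs. conflict area

    # Temperature vs. response mode, conflict area partialed out.
    print(round(partial_corr(r_temp_mode, r_temp_area, r_mode_area), 3))
    # Temperature vs. conflict area, response mode partialed out.
    print(round(partial_corr(r_temp_area, r_temp_mode, r_mode_area), 3))
```

If the correlation between two variables nearly vanishes once the third is held constant, while the reverse partial stays sizable, the third variable is a plausible common link; that is the pattern of reasoning applied to the Avoidance and Aggression triad.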
We turn now to the relation between response
modes and physiological activity.
The prediction was made that the three spe- 
cific defensive modes (Recoding, Rational- 
ization, and Personalization) should be dif- 
ferentially effective in preventing visceral 
discharge. According to our classification 
of response modes, subjects who recode, in 
effect, avoid the meaning of the threatening 
material. If successful, such recoding 
should result in lower physiological “anxi- 
ety.” Similarly, Rationalization and Person- 
alization should—according to our rationale 
of increasing personal involvement—lead to 
increasing physiological activity. Of the
other two modes, Interference, as an independent
sign of high degrees of disorganization,
should be associated with a high
degree of physiological activity, while
Avoidance, as a pervasive anxiety response
to the task, should probably also be associated
with a high degree of visceral anxiety.
Three indices were used to explore
these relationships. The first is the average 
correlation between each response mode and 
nine physiological indices (excluding GSR), 
the second is the correlation between each 
response mode and the summary physiolog- 
ical measure, and the third is the mean sum- 
mary physiological activity measure for the 
five subjects highest in each of the response 
modes. These data are shown in Table 7. 
As far as the three specific defensive modes 
are concerned the data generally bear out 
our prediction. The two correlational 
measures show negative correlations be- 
tween physiological activity and Recoding 
and a positive correlation for the other two 
modes. Similarly the mean physiological 
score for the subjects highest on Recoding 
is lower than the score for the subjects in 
the other two modes. Statistical evaluation 
of the average correlational measure (anal- 
ysis of variance performed on z scores) 
shows a significant variation among response
modes (F = 8.98, p < .01); on the
second correlational technique separate t
tests indicate that the correlation with Recoding
is significantly different from the
other correlations which do not differ sig- 
nificantly from each other; the analysis of 
variance for the mean physiological meas- 
ures is not statistically significant. It might 
be noted in addition that for all three analyses
Interference is most strongly associated
with physiological activity, while
Avoidance is next highest. These data bear 
out our major prediction that Recoding as a 
defensive mode should be associated with 
low physiological activity. They lend fur- 
ther support to the utility of the Phrase As- 
sociation Test and the scoring methods 
employed. 
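The averaging of correlations described above can be sketched as follows. The text mentions an analysis of variance "performed on z scores"; this sketch assumes that refers to Fisher's r-to-z transformation, which is an assumption rather than something the text confirms. The function names and the input list are hypothetical.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher r-to-z transformation; r must lie strictly between -1 and 1."""
    return math.atanh(r)


def mean_correlation(rs: list[float]) -> float:
    """Average correlations in z space, then transform the mean back to r."""
    if not rs:
        raise ValueError("need at least one correlation")
    mean_z = sum(fisher_z(r) for r in rs) / len(rs)
    return math.tanh(mean_z)


if __name__ == "__main__":
    # Hypothetical correlations of one response mode with several indices.
    correlations = [0.10, 0.25, -0.05, 0.30]
    print(round(mean_correlation(correlations), 3))
```

Averaging in z space rather than averaging raw r values avoids the bias that comes from the bounded, skewed sampling distribution of r.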


Stimulus Differences and Physiological 
Arousal 


The final analyses concern differences 
among stimuli in the degree to which they 
arouse physiological activity in our subjects. 
The first question asks whether the different 
areas (types of stimuli) differ in the degree
to which they elicit physiological responses.
Once again we used a composite
measure based on the sum of standard
scores for each stimulus of the eight physiological
indices for stimuli described in the
Method section. In the case of the stimuli
we included the GSR measure since it did 
not show the pattern of negative correla- 
tions with other physiological measures 
which we found for the subject population. 
The means for the four areas on this summary
measure are shown below:

NEUTRAL   AGGRESSION   SEX     DEPENDENCY
6.67      10.75        13.00   12.00


An analysis of variance indicated significant 
variation among the four areas (F = 6.70, 
p < .05). However, Tukey's gap test per- 
mits us to state only that the Neutral 
phrases differ significantly from the other 
three, which in turn are not significantly
different from one another. 
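The composite measure described above, a sum of standard scores across physiological indices, can be sketched as follows. The exact standardization is given in the Method section, which is not reproduced here, so plain z scores are an assumption; the index names and values are hypothetical.

```python
import statistics


def standard_scores(values: list[float]) -> list[float]:
    """Convert raw values to z scores (population standard deviation)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0.0:
        return [0.0 for _ in values]  # no variation: every score is at the mean
    return [(v - mean) / sd for v in values]


def composite_scores(indices: dict[str, list[float]]) -> list[float]:
    """Sum the z scores of every index, stimulus by stimulus."""
    lengths = {len(v) for v in indices.values()}
    if len(lengths) != 1:
        raise ValueError("every index needs one value per stimulus")
    z_by_index = [standard_scores(v) for v in indices.values()]
    return [sum(column) for column in zip(*z_by_index)]


if __name__ == "__main__":
    # Hypothetical mean responses to three stimuli on two indices.
    data = {
        "heart_rate": [1.0, 2.0, 3.0],
        "respiration": [10.0, 20.0, 30.0],
    }
    print([round(c, 3) for c in composite_scores(data)])
```

Standardizing each index before summing keeps an index measured in large units from dominating the composite.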


TABLE 7

RELATIONS BETWEEN RESPONSE MODES AND TOTAL PHYSIOLOGICAL RESPONSE

                                   Recoding  Rationalization  Personalization  Avoidance  Interference
Average correlation with
  physiological indices             -.126        .142             .128           .223        .241
Correlation with physiological
  summary measure                   -.219        .240             .242           .391        .424
                                                                                 (.05)       (.02)
Mean physiological response of
  five highest subjects             12.6         16.6             17.6           20.2        20.8


It will be recalled that three blank stimuli
were introduced into the series of phrases
in order to check on the possibility of a conditioned
physiological response to projector
noise and stimulus changes. An analysis of
the mean response to the three blank stimuli
indicated that in no case was there a rise in
response from the first to the last blank,
which would have indicated conditioning.
The physiological responses to the blanks
either decreased or showed no change across
time. In all cases the mean response to the
blank stimuli was less than the mean response
to the neutral stimuli.

Having substantiated that reliable differences
exist between neutral and threat materials
as far as physiological response is concerned,
we now inquire whether there is a
relation between the physiological and verbal
response to stimuli. The correlation
between mean PT score and the summary
physiological measure for stimuli is .715.
Thus there is a strong association between
the amount of verbal disturbance a stimulus
elicits and the degree of physiological response
to it.

In contrast to the subject correlations on
response modes we find a significant positive
correlation for the summary physiological
measure with Recoding (r = .469, p < .05),
as well as a positive correlation with
Interference (r = .524, p < .05).* Thus
phrases which tend to elicit recoding responses
also elicit a high degree of physiological
response. This is in contrast to subjects,
where a high degree of recoding is
related to low physiological responding.
This juxtaposition nicely illustrates the distinction
between stimulus and subject correlations.
What is likely to be the case for
stimuli is that those phrases which are most
threatening elicit a high degree of physiological
response from some subjects, but
that subjects who tend to use Recoding as a
major mode of defense will use it most frequently
with these same threatening stimuli.
Thus a high degree of threat in a stimulus
can, on the average, result in visceral anxiety,
extreme verbal disturbance (Interference),
and a potent defense, even though
for individuals the recoding defense and the
visceral discharge appear to operate alternatively.

* The other three response modes are not significantly
related to the physiological measure.
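The distinction between subject and stimulus correlations can be illustrated with a small sketch. The same subject-by-stimulus score matrices are averaged either across stimuli (one value per subject) or across subjects (one value per stimulus) before correlating. All numbers and names below are hypothetical and were constructed only to show that the two correlations can differ in sign; they are not the study's data.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def row_means(matrix: list[list[float]]) -> list[float]:
    """One mean per subject (rows are subjects)."""
    return [sum(row) / len(row) for row in matrix]


def col_means(matrix: list[list[float]]) -> list[float]:
    """One mean per stimulus (columns are stimuli)."""
    return [sum(col) / len(col) for col in zip(*matrix)]


# Hypothetical model: threatening stimuli raise both recoding and arousal,
# while subjects who habitually recode show less arousal.
subject_tendency = [0.0, 1.0, 2.0]        # habitual use of recoding
stimulus_threat = [0.0, 1.0, 2.0, 3.0]    # threat value of each phrase

recoding = [[s + t for t in stimulus_threat] for s in subject_tendency]
arousal = [[t - s for t in stimulus_threat] for s in subject_tendency]

subject_r = pearson(row_means(recoding), row_means(arousal))
stimulus_r = pearson(col_means(recoding), col_means(arousal))

if __name__ == "__main__":
    print("across subjects:", round(subject_r, 3))
    print("across stimuli:", round(stimulus_r, 3))
```

In this constructed case the correlation is negative across subjects and positive across stimuli, which is the same qualitative pattern reported for Recoding.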

Adding these findings to our previous ob- 
servations on stimulus differences we can 
characterize the cluster of highly correlated 
stimulus effects as consisting of long reac- 
tion times, extreme verbal disturbance, high 
visceral activity, and primitive defenses in- 
volving recoding and denial. These charac- 
teristics confirm what are commonly postu- 
lated as typical effects of high degrees of 
threat and stress. 


SUMMARY AND CONCLUSIONS


In these studies we have explored rela- 
tions among various modes of verbal re- 
sponse to threat, differences among types of 
threat areas, and physiological response to 
them. In undertaking a comparison between 
two widely used indices of anxiety—verbal 
disorganization and visceral response—we 
have been able to make effective use of a 
new research tool: Heath's Phrase Association
Test. Using a 29-item checklist of subjects'
responses to neutral and threatening
phrases, we obtained various findings that
substantiated the
validity of this test. Our major findings
may be summarized as follows: 

1. The verbal response measures reliably 
differentiated between various areas of 
threat—defined by phrase content—and in 
particular between neutral phrases and 
phrases with threat content. 

2. Physiological response also differenti- 
ated neutral from threatening stimulus 
items. 

3. Modes of response, using a five-category
classification of the signs of disorganization,
showed reliable relations among
one another. The findings substantiated a
theoretical approach to verbal response
which emphasized degree of personal involvement
on the part of the subject. This
was particularly apparent in the degree to
which subjects who successfully avoid personal
involvement also show less physiological
response to the stimuli.

4. The method showed some promising
results in an attempt to distinguish between
generalized anxiety evoked by the test situation
and specific anxiety reactions to particular
kinds of stimuli.

5. Individual differences in modes of re- 
sponse were contrasted with differences 
among stimuli and their tendencies to evoke 
different kinds of response modes. The 
major finding here was the identification of 
a cluster consisting of behavioral interfer- 
ence, recoding or denial of meaning, high 
degree of physiological response, and long 
reaction times, all of which are apparently 
elicited by the same kinds of stimuli. 

6. By comparison with a previous study 
we concluded that a distinction can be made 
between reactions to intellective and to af- 
fective threat. Intellective tasks apparently 
are more likely to be disrupted by subjects’ 
perceptions of and preoccupation with bodily
events, while affective tasks, such as
the Phrase Association Test, show disorganization
to be more dependent on actual
physiological involvement. 


7. An analysis of self-report scales and 
various physiological measures suggested a 
preliminary distinction between two com- 
ponents of the anxiety syndrome: activation 
and emotionality. Activation is associated 
with individual differences in the galvanic 
skin response and scores on Taylor’s Mani- 
fest Anxiety scale. Emotionality is associ- 
ated with signs of verbal disorganization, 
situational awareness of bodily reactions, 
and physiological arousal in channels other
than the GSR. As a measure of individual 
differences, GSR measures tend to correlate 
negatively with other physiological indices. 

8. We were able to substantiate sugges- 
tions from the literature that concern with 
sexual and aggressive problems is differenti- 
ally associated with temperature increases 
and decreases, respectively. 

9. Results from a group Rorschach suggest
that subjects who show a high degree
of anxiety on the Phrase Association Test 
will tend to use less manifest sexual and 
aggressive imagery on the Rorschach than 
subjects low in anxiety.

APPENDIX A

PHRASES FOR STUDY I

Neutral
N1. The horses worked well together
N2. Steel company made new equipment
N3. Farmer dug a new well
N4. Tugs helped ships reach port
N5. Children given free summer camp
N6. Architects planned home for family

Aggression
A1. Father convicted for torturing son
A2. Boy beat mother into unconsciousness
A3. Mother burned baby in bath
A4. He suddenly struck his father

Sex
S1. Two male monkeys sexually embraced
S2. Prostitutes do anything men desire
S3. Female monkey tried mating male
S4. He enjoys sleeping with men

Dependency
D1. Mother bears desert baby cubs
D2. Father neglects his sick child
D3. Father lions desert their cubs
D4. Mother sent neglected child away

APPENDIX B

PHRASES FOR STUDY II

Neutral
N1. Architects planned home for family
N2. He built his own boat
N3. Circus gave them free passes
N4. The craftsman designed new ornaments
N5. The dairy farm bought cows
N6. The horses worked well together
N7. He overhauled the old motor
N8. Steel company made new equipment

Aggression
Against
A1. He suddenly struck his father
A2. He spit in his mother's face
A3. Boy beat sister into unconsciousness
A4. He beat up his roommate

From
A5. Father convicted for torturing son
A6. Mother brutally beat her child
A7. His brother kicked him in the stomach
A8. Student attacked by gang

Sex
Hetero
S1. His girl friend is very promiscuous
S2. He propositioned the waitress
S3. After the operation he was impotent
S4. The prostitute slept with the student

Homo
S5. He enjoys sleeping with men
S6. Homosexuals are easily recognized
S7. He likes watching nude men
S8. His roommate made a pass at him

Dependency
Rejection
D1. Father neglects his sick child
D2. She deserted her baby boy
D3. His brother refused to help
D4. His roommate would not loan him money

Subjugation
D5. He pleaded with his father
D6. His mother had to support him
D7. His sister had to protect him
D8. He needed help with his homework

Competition
C1. He lost the game to his father
C2. His mother is smarter than he
C3. His brother is more popular than he
C4. He just missed the dean's list
C5. He did not fulfill his father's hopes
C6. His mother was disappointed with his grades
C7. He did not get the promotion
C8. He failed to make any team

APPENDIX C


SCORING MANUAL FOR THE PHRASE ASSOCIATION TEST

I. Avoidance

1. Comment on wording or phrasing, reference to task or to other phrases, explicit or implicit (e.g., "These are all bad guys," "It looks like a headline").

2. Comment on physical aspects of stimulus material.

3. Asks for repetition of phrase.

4. Gives no substantive response, or says he cannot think of response (whether or not he gives a response).

5. Denies own response or questions its adequacy (e.g., "No," "I meant ...").

6. Simple repetition of stimulus or restatement without addition of new content; may be synonymous expression. Major content of phrases must be restated in order to score. Also score if subject adds only "Why?" to repetition of phrase.

7. Repeats exactly one or two words of the stimulus only.

8. One of three shortest reaction times in record, unless there are ties.

II. Interference

9. More than one response, even if only fragmentary or if one of the responses is nonsubstantive. Do not score if second response is a simple elaboration of the first, but always score if there is clearly more than a one-phrase response. Do not score enumerations.

10. Response is unfinished or broken.

11. Repeats own response (one word or more).

12. Change in length from other responses: the longest (shortest) response is at least twice (no more than half) as long as the next longest (shortest). Or response is one of two (but not more) which fulfill this criterion.

13. Gives one-word substantive response or two responses which each consist of a single word.

14. Laughs or sighs.

15. One of three longest reaction times for substantive responses. Also score for absence of a substantive response.

III. Recoding

16. Misinterpretation or nonsensible response.

17. Evasion. Evades central notion of stimulus by giving irrelevant or tangential response which has some connection with the stimulus. Major criterion is evasion of central meaning of the stimulus phrase, a failure to take into account the essential communication of the phrase. Also score if no substantive response.

18. Reversal of meaning, e.g., from "a does to b" to "b does to a" or from "a hates b" to "a loves b." Also score if inserts positive qualities for actor engaged in reprehensible deed.

19. Criticizes or questions clarity of meaning (e.g., "I don't understand that sentence").

20. Denial of truth of phrase, explicit ("People don't act that way") or implicit (from "a does x" to "b does not do x"). Explicit or implicit denial of stimulus or its consequences for the responder ("It doesn't matter," "My mother wouldn't").

21. Intensification or approval of deviant behavior. Frequently this changes meaning by making phrase seem ridiculously extreme, an undoing by intensification (e.g., "And then he raped her," "So what!" "Good!").

IV. Rationalization

22. Response is in the form of a question.

23. Questions rather than denies the validity of the stimulus phrase. Expresses doubt rather than disbelief (e.g., "I wonder," "Does that really happen?").

24. Justifies or defends central theme of phrase by invoking psychological motives such as character structure ("He was weak, abnormal") or by inventing sufficient psychological causes for the act. Also score if asks for such an explanation or cause for the act or shifts responsibility within the phrase.

25. Reference to norm, e.g., "x is usual, typical, common." Justification by reference to norm.

26. Introduction of characters other than subject's family or friends.

V. Personalization

27. Reference to self or family or name of friends or acquaintances.

28. Any emotional reaction to statement; any value, ethical, or moral judgment. Must be clearcut in terms of our cultural norms, not just descriptive adjectives such as "smart."

29. Affect or value judgment is attributed to actor in phrase.
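The grouping of the 29 checklist items into the five response modes can be sketched as a simple lookup. The item ranges follow the headings of the manual above; the tallying rule (one point per checked item) is an assumption, since this section does not state how item checks were combined into mode scores, and the function names are hypothetical.

```python
# Item ranges taken from the section headings of the scoring manual.
MODE_ITEMS = {
    "Avoidance": range(1, 9),          # items 1-8
    "Interference": range(9, 16),      # items 9-15
    "Recoding": range(16, 22),         # items 16-21
    "Rationalization": range(22, 27),  # items 22-26
    "Personalization": range(27, 30),  # items 27-29
}


def mode_of(item: int) -> str:
    """Return the response mode to which a checklist item belongs."""
    for mode, items in MODE_ITEMS.items():
        if item in items:
            return mode
    raise ValueError(f"item {item} is not on the 29-item checklist")


def tally_modes(checked_items: list[int]) -> dict[str, int]:
    """Count checked items per mode (assumed scoring rule: one point each)."""
    counts = {mode: 0 for mode in MODE_ITEMS}
    for item in checked_items:
        counts[mode_of(item)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical record: items checked for one subject's response.
    print(tally_modes([3, 8, 10, 17, 20, 28]))
```

Note that several items (4, 15, and 17) are all scored when no substantive response is given, so a single blank response contributes to more than one mode; the sketch leaves that cross-scoring to whoever fills in the checklist.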
APPENDIX D

SCORING MANUAL FOR THE ANALYSIS OF RORSCHACH RESPONSE CONTENT

Responses to the Rorschach inkblots were scored for latent and manifest imagery in the threat areas of sex, aggression, dependence, and competition according to the criteria listed below. Responses which satisfied the criteria for two threat areas were scored in both areas.

Sex

Manifest: Includes responses which explicitly mention sexual organs or sexual acts.

Latent: Includes responses which mention symbols of sexual organs and acts, e.g., swords, caves, long noses, vases, burlesque dance.

Aggression

Manifest: Includes responses which make explicit reference to physical combat, e.g., mooses fighting, people kicking, pushing.

Latent: Includes responses which refer to blood, death, decay, destruction, hostility, and mutilation, e.g., scowling faces, bloody fingerprints, explosion, rotting carcass.

Dependence

Manifest: Includes responses which mention nurturance, shelter, guidance, e.g., nursing infant.

Latent: Includes responses which mention symbolic support, submission, weakness, rejection, food supply.

Competition

Manifest: Includes responses which mention conflict not related to aggression, e.g., men arguing, haughty women ignoring each other.

Latent: Includes responses which mention striving, upward movement, e.g., ascending bird.

APPENDIX E

MATRIX OF INTERCORRELATIONS

RESIDUA OF SHOCK-TRAUMA IN THE WHITE RAT:


A THREE FACTOR THEORY¹

KENNETH H. BROOKSHIRE
Franklin and Marshall College

RICHARD A. LITTMAN
University of Oregon

AND CHARLES N. STEWART
Saskatchewan Hospital, North Battleford

It is no longer necessary to argue for a
relationship between early experiences
and later adjustive behavior and capacities.
An extensive clinical and experimental
literature has burgeoned in the past two
decades. Initially, much of the work of experimental
investigators was prompted by a
desire to explore and test the ideas of clinical
workers. However, by their probing,
experimental investigators have established
a reverse tradition; now clinical workers
tend to use the results of experimental research
as support and stimulation for further
clinical observation and judgment
(Bowlby, 1953). Such a "redress" is surely
the prelude to a deeper and more penetrating
union of clinical and experimental insights.
In an area so rich in problems and data that
have both theoretical and practical importance,
one has every reason to expect and
hope for an amalgamation of experimental
and clinical thinking.


One of the oldest traditions of speculation
about the relation between early experience
and adult behavior centers on the effects of
traumatic and unusual experiences suffered
by infant organisms. From the Wild Boy
of Aveyron to Riesen's chimpanzee (1958)
and Harlow's rhesus (1958) there runs the
thread of interest in what happens when the
infant organism's environment is seriously
modified. In all these studies, the main interest
has usually not been in the pathology
itself, fascinating though it may be, but
in the information it provides about the
processes and outcomes of normal development.
The present report is in this tradition,
dealing with the hypothesis that intense and
unusual stimulation early in life may produce
profound and persisting effects on later
behavior.

¹ We gratefully acknowledge grants from the
following sources: to Richard A. Littman from
the Graduate School Research Fund of the University
of Oregon and Grants M-1695 and
M-3801(A) from the National Institute of Mental
Health; to Kenneth H. Brookshire, Grant APA-43
from the National Research Council of Canada.
At one time during the course of the investiga[...]
University of Oregon was the research assistant
in several of the experiments.

Much of the research in this area has 
focused on the effects of unusually impover- 
ished or enriched environments where treat- 
ment extends over long periods in the life 
of the organism. This program, however, 
was designed to investigate the effects of a
relatively narrow range of intense stimulation
within a brief period of the life of immature
rat pups. More recently there has
been an increase in the number of investi- 
gators also studying more limited and pre- 
cisely known treatments (Ader, 1959; 
Denenberg & Bell, 1960). Nevertheless, 
there is a good deal of similarity between 
the objectives of the more narrowly defined 
research and the earlier investigations. They 
both are interested in the effects of different 
opportunities during infancy for perceptual 
and problem solving activities, and they both 
seek to discover and understand the nature
of these experiences and the mechanisms and 
processes by means of which they determine 
later behavior. 

Because there have been several thorough 
reviews of recent research we shall not un- 
dertake one here. The final section, however, 
contains a discussion of some related investi- 
gations in connection with a number of ques- 
tions and problems raised by our findings. 
It seems fair to say, however, that all of 
the reviewers agree on the necessity for 
more careful and systematic work because 
the practice of postulating some infantile 
conditions or experience to explain a set of 
observations upon adults leads to an uncritical
and ready acceptance of whatever
favorable material appears. If the explana- 
tion of certain observations of adult be- 
havior rests upon matters which can be 
directly and fairly easily studied, then such 
investigations should be carried out. This is 
surely the only way to temper “wild” appeals 
to early experience. 


Methodological Considerations 


As in other research areas there are cer- 
tain unique methodological problems which 
confront investigators and considerable at- 
tention has been given to them by recent 
writers (Ader, 1959; Baron, Brookshire, & 
Littman, 1957; King, 1958). Although it 
is true that studies of early experience must 
contend with design problems which are 
often more demanding than those in other 
research areas—the longitudinal nature of 
the research invites the intrusion of illness, 
death, uncontrolled experiences, or variations 
in "standard" laboratory conditions, etc., 
between treatment and test—it is also true 
that our knowledge of the effects of early 
experience has suffered from inadequate 
attention to methodological details which can 
be dealt with. There are six points which 
are especially crucial. 


Meaning or Denotation of Early Experience.
As King (1958) and Baron, Brookshire,
and Littman (1957) have indicated,
the term "early" is vague and has been used
to cover a wide age range. If the effects involved
are in any way a function of the developmental
level of the organisms studied,
then some more precise way of indicating
such levels is necessary. Hence King has
suggested, for example, that the rat's life
may be broken into three phases: infantile
(0-20 days), adolescent (21-70 days), and
adult (71 days and older). By this criterion, for example,
the research of Baron, Brookshire,
and Littman (1957), which purported to deal
with infantile experience, actually dealt with
adolescent experience; now the present
authors use the term "weanling" to refer to
the same early experience phase. It is clear
that some unambiguous distinctions are required
if the effects of early experience are
to be understood.
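King's three-phase division can be written down as an explicit rule, which makes the reclassification of a "weanling" treatment easy to check. This is a minimal sketch; the function name is hypothetical, and only the age boundaries come from the text.

```python
def king_phase(age_days: int) -> str:
    """Classify a rat's age by King's (1958) three phases as cited above."""
    if age_days < 0:
        raise ValueError("age in days cannot be negative")
    if age_days <= 20:
        return "infantile"    # 0-20 days
    if age_days <= 70:
        return "adolescent"   # 21-70 days
    return "adult"            # 71 days and older


if __name__ == "__main__":
    for age in (5, 21, 70, 71, 120):
        print(age, king_phase(age))
```

Under this rule any treatment given on or after day 21 falls outside the infantile phase, which is why a study of weanlings counts as adolescent experience by King's criterion.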


Previous Experience and Early Experi- 
ence. Demonstrating that there are differ- 
ences at maturity between a control and an 
experimental group exposed to varying early 
experiences is not a sufficient basis for say- 
ing that this is an effect of early experience. 
The theories of Freud and Hebb, for ex- 
ample, assert that experiences early in life 
have effects which are unique and irrevers- 
ible. This implies that the same experiences 
later in life will not have the same consequences.
In order to test this, however, it is
necessary to include in the design of the
experiment an adult-experience control
group, a procedure which, as Baron, Brookshire,
and Littman (1957) and Ader (1959)
point out, has rarely been adopted. As an
illustration of the significance of this factor,
the study by Baron, Brookshire, and Littman
showed that the same traumatic treatment of
weanling and adult animals resulted in the
same behavior during an adult test, and that
the behavior differed from that of untreated
control subjects. It is certainly true that
such "negative" findings for theories proposing
unique effects early in life must be
viewed in terms of the particular experimental
conditions. Nevertheless, it is a clear
warning that an adult experimental group
paralleling infantile or adolescent groups
must be provided when attempting to determine
the effects of early experience on
organisms; otherwise, these studies will
simply reduce to investigations of the effects
of previous experience. Previous experience
is not an age-defined characteristic and as a
theoretical term certainly conveys quite a
different sense from early experience.


Comparing Different Investigations. According
to King (1958) there has been a
"lack of attention to variables other than the
one being manipulated" in most early experience
studies. It is, therefore, difficult,
if not impossible, to compare most studies of 
early experience with one another in an 
attempt to cross-validate results. For ex- 
ample, most studies differ in one or more of 
the following ways—age at time of treat- 
ment, age at time of test, duration of treat- 
ment or test, interval between treatment and 
test, etc. Such differences raise one of the 
most crucial problems confronting psycholo- 
gists in general, viz., the lack of a rational 
and lawful taxonomy for differences in 
task and behavior parameters; in other 
words, where there are differences in tasks 
and behavior measures, for example, jump- 
ing apparatus, linear or open-field mazes, 
Skinner-type problem boxes, discrimination 
boxes, obstruction problems, etc., it is im- 
possible to integrate the conflicting findings 
that presently characterize the area. Except 
for the dramatic demonstration that there 
are negative or positive effects of a general 
nature attributable to massive restriction or 
expansion of experience possibilities in early 
life, detailed comparisons are virtually im- 
possible. 

One-Experiment Pattern of Research. An
especially characteristic feature of research 
in this area is the large number of investi- 
gators who conduct only one study; fur- 
ther, one rarely finds replications either of 
one’s own work or that of others. This, 
coupled with small effects and usually 
marginal probabilities wherever differences 
are assessed, makes it difficult to know what 
"weights" to assign to different investiga- 
tions. As a model contrast, we know a good 
deal about the relation between early experi- 
ences and subsequent hoarding behavior be- 
cause of the careful design and replications 
conducted by Hunt and his colleagues 
(Hunt, 1941; Hunt, Schlosberg, Solomon, 
& Stellar, 1947; Hunt & Willoughby, 
1939) on the one hand and Marx and his 
co-workers on the other (Marx, 1950a, 
1950b, 1952).

Confounding of Test Results. Stemming
from the “one-experiment” pattern and the 
tendency to confuse early with prior ex- 
perience there is a tendency to use a variety 
of adult tests on the same animals (Ader, 
1959; Griffiths & Stringer, 1952). Under 
such circumstances, unless there already ex- 
ists a considerable body of control informa- 
tion, one cannot separate the roles of early 
and immediately prior experience. It is es- 
sential, at the very least, to counterbalance 
conditions for groups; for example, if one 
wants to know the effect of, say, Treatment 
X on maze learning and avoidance learning, 
all subjects should not be studied first on 
the maze and then on an avoidance test 
apparatus. 
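The counterbalancing recommended above can be sketched as a rotation of test orders across subjects. This is an illustrative scheme under simple assumptions (every possible order is used, and subjects are assigned in rotation); the function name and subject labels are hypothetical, and a real design would normally rotate within each treatment group and randomize which subject receives which order.

```python
from itertools import cycle, permutations


def counterbalance(subjects: list[str], tests: list[str]) -> dict[str, tuple[str, ...]]:
    """Assign each subject one ordering of the tests, rotating through all orders."""
    orders = list(permutations(tests))
    return {subject: order for subject, order in zip(subjects, cycle(orders))}


if __name__ == "__main__":
    # Hypothetical subjects within one treatment group.
    group = ["rat_1", "rat_2", "rat_3", "rat_4"]
    for subject, order in counterbalance(group, ["maze", "avoidance"]).items():
        print(subject, " -> ".join(order))
```

With two tests, half of each group meets the maze first and half meets the avoidance apparatus first, so an effect of test order is not confounded with the effect of the early treatment.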

Limited Analysis. While it is indeed im- 
portant to demonstrate that there is some- 
thing about a previous treatment that in- 
fluences the test behavior of subjects, there 
is much more that requires doing. If the 
demonstrated relationships are to be more 
than just dramatic but isolated findings, it 
is necessary to tease out the mechanisms 
involved; that is, what are the independent 
variables, to what dependent variables are 
they related, and in what manner? When 
the answers to such questions emerge we 
can then say we know which concepts must 
be used, rather than which ones may be 
used. 


Overview of Studies 


Prompted by the positive findings of an 
earlier study (Baron et al., 1957), a connected
series of six experiments dealing
with the consequences of severe electric 
shock was undertaken. The aims of this 
series were: (a) replication of the previously 
obtained long-term effects of intense stimu- 
lation; (b) discovery of the conditions for 
introducing shock residua into the repertoire 
of organisms; (c) specification of some of 
the properties of such shock-induced residua, 
for example, how long do they last, how 
may they be altered, are there any critical 
periods, etc.; (d) description of the rela- 
tionship of these residua to adult learning 
and emotional phenomena.

Throughout the report, the terms "residual,"
"shock residua," and other grammatical
variants will be used. This neutral
usage has been adopted because there are a 
number of different ways of conceiving of 
the phenomenon under investigation. It 
might, for example, be called “fear,” “pain,” 
“anxiety,” “emotional habit,” etc., but since 
these and similar terms carry special mean- 
ings for different investigators we have 
elected to keep the report of the experi- 
mental findings clear from contaminating 
allusions. In the discussion section, an at- 
tempt will be made to relate this empirical 
“residual” to some of its putative relatives 
and their characteristics.

One other introductory comment about 
the studies should be made. In some respects 
this research program is isolated from that 
of other workers in the field. In spite of 
the obvious relations “by problem” to other 
work, the studies have been cast within a 
fixed design which does not match in any 
substantial detail the designs or procedures 
of other investigators; this was done to 
facilitate parametric analysis. Nevertheless, 
it is possible to draw out some broad rela- 
tionships between the results of our work 
and that of other investigators and this will 
be attempted in the discussion section. 


Experiment I. Age and Retention In- 
terval as Factors Influencing the Effect of 
Trauma. The first study (Baron et al.,
1957) had used an interval of 100 days
between weanling trauma and adult test for 
the residual but had only a 1-day interval be- 
tween adult trauma and test. Since weanling 
traumatized subjects performed at the same 
level as adult traumatized subjects relative to 
controls, it was possible that the initial im- 
pact of trauma upon pups is greater than 
upon adults but dissipates with age, hence 
resulting in an attenuated residual. There- 
fore, a 100-day interval between trauma and 
testing was studied for both weanlings and 
adults. 


Experiment II. Role of Duration and 
Frequency of Trauma in Establishing Resid- 
ual. In the initial phases of the research we 
had arbitrarily selected certain fixed durations
and intervals for traumatization. But
there is no reason to expect the effects of
trauma to be uniform and independent of its 
duration or frequency. Therefore the conse- 
quences of different frequencies and dura- 
tions of trauma were studied. 


Experiment III. Behavior Possibilities
for the Animal at the Time of Trauma. It 
is possible that the effect of intense shock 
is so massive that almost none of the cir- 
cumstances surrounding its administration 
can alter the effect of the residual upon 
subsequent behavior—the effect is always 
and everywhere the same for given intensi- 
ties. To psychologists interested in learning 
this is an unpalatable possibility; it seems 
more reasonable to assume that there is some 
kind of learning at the time the shock is 
administered and that the effects of this 
experience may be adaptive, maladaptive, or 
irrelevant for future behavior as a function 
of relations between trauma and testing cir- 
cumstances. Consequently, eight groups 
were studied where the conditions of shock 
allowed subjects to behave in different ways. 


Experiment IV. Residua and Adult Be- 
havior under Dissimilar and Unstressful 
Test Conditions. In the event the general 
hypothesis that trauma leaves residua is 
true, it is important to know whether the 
residua require stress to manifest them- 
selves or whether they produce a general 
"characterological" alteration of large, if 
not all, portions of the organism's repertoire. 
This was tested by running subjects in a
closed-field test and comparing activity and
exploration for control and traumatized
groups.

2 Nor, for that matter, of its intensity. However,
we leave for the future a report on the
effects of shock level for the same trauma and
test parameters. Even so, the previous study suggests
that a moderate and severe shock at the time
of testing do not differ appreciably in their ability
to reveal trauma residua. Whether moderate and
weak shock are equivalent in establishing residua
is another matter entirely, and on this we have
no information from our procedures. Nevertheless,
the work of Denenberg and his colleagues suggests
that the strength of residua varies with that of
trauma. Since he studied mice it is possible that
such results would not hold for rats; this possibility
is enhanced by the fact that he did not find
an interaction in rats between infantile handling
and avoidance learning to different levels of shock
(Denenberg & Karas, 1960) as he did in the mice
(Denenberg & Bell, 1960).

Experiment V. Residua and Adult Be- 
havior under Similar and Stressful or Un- 
stressful Test Conditions. Part of the an- 
swer to how adult behavior under stress and 
nonstress reflects early experience depends 
on knowing if the adult test situation must 
resemble, that is, have cue properties similar 
to, the trauma situation. Consequently, 
several different trauma conditions were 
used and animals subjected to them as 
weanlings were then tested for the role of 
environmental cue factors versus more gen- 
eral responsivity changes. 

Experiment VI. Residua and Adult Be- 
havior under Nonshock Stress—Mild and 
Severe Hunger. Also closely related to Ex- 
periment IV is the question of whether stress 
conditions different from those imposed 
during the weanling period will differentiate 
between treated and control subjects. Even 
if shock residua appeared during unstressful 
test circumstances, it is possible that its effect 
might be enhanced by other forms of stress. 
Therefore, animals who had been trau- 
matized with electric shock as weanlings 
were tested for running speed to a goal box 
containing food when they were hungry.* 


GENERAL DESIGN 


The detailed designs, procedures, and results will 
be given separately for each experiment. How- 
ever, there are a number of features common to 
all the experiments and these will be presented 
together to aid the reader in grasping the over-all 
framework of the studies. 


Subjects 


The total number of Ss was 330, of which 177 
were males and 153 females. Ss were bred and 
reared in the University of Oregon laboratories, 
except those used in Experiment V; all were from
the Sprague-Dawley strain. 


3 In addition to the experiments reported here, 
there have been several studies dealing with the 


for publication, but the authors will be glad to 
furnish advance summaries to readers who might 
be interested in the results.

In each experiment subgroups were formed and
balanced for weight, sex, and litter at the time of 
weaning. With the exception of Experiment IV, 
which used only male Ss, all the experiments had 
an equal number of males and females per sub- 
group. 

The N for subgroups was equal in each experi- 
ment, though different experiments had subgroups 
of different sizes. Table 1 shows the N for each 
experiment. 
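
The allocation figures can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the dictionary is transcribed from Table 1, the function name `totals` is our own, and the half-male, half-female split per subgroup is taken from the statement above that all experiments except IV had equal numbers of each sex.

```python
# Subject allocation transcribed from Table 1:
# experiment -> (number of subgroups, subjects per subgroup)
ALLOCATION = {
    "I": (12, 8),
    "II": (5, 8),
    "III": (8, 10),
    "IV": (3, 8),   # male subjects only
    "V": (6, 10),
    "VI": (3, 10),
}
MALE_ONLY = {"IV"}


def totals(allocation, male_only):
    """Return (total N, males, females).

    Assumes every subgroup outside `male_only` is half male, half female.
    """
    total = males = females = 0
    for experiment, (n_subgroups, n_per_subgroup) in allocation.items():
        n = n_subgroups * n_per_subgroup
        total += n
        if experiment in male_only:
            males += n
        else:
            males += n // 2
            females += n // 2
    return total, males, females


print(totals(ALLOCATION, MALE_ONLY))  # (330, 177, 153)
```

Under these assumptions the result agrees with the totals given in the text: 330 Ss, of which 177 were males and 153 females.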


Apparatus 


This section will describe all the items of equip- 
ment used under two headings, trauma and test. 
The experiment in which each appeared will be 
indicated. 


Trauma 


Shock Source. A neon sign transformer was 
used to step up the normal line voltage to 7,500 
volts. The circuit into which the transformer was 
wired contained 6-megohms resistance so that the 
current output was approximately 1.25 milliam- 
peres. While this output was not continuously 
monitored it was periodically checked and in no 
case varied more than 3%. The voltage and 
current levels used were, therefore, very high and, 
coupled with the relatively low resistance variations
attributable to individual differences in
Ss, provided us with a very stable traumatizing 
agent. There was no commutator in the circuit. 
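
The stated current follows from Ohm's law, and the same relation suggests why a large series resistance makes the current insensitive to differences among Ss. The sketch below is an illustration, not a description of the actual circuit: the subject resistances are hypothetical values chosen only for the example, and the circuit is treated as purely resistive.

```python
SOURCE_VOLTS = 7_500.0           # transformer output stated in the text
SERIES_RESISTANCE_OHMS = 6.0e6   # 6 megohms stated in the text


def current_ma(subject_resistance_ohms=0.0):
    """Current in milliamperes through the series circuit (I = V / R)."""
    total_resistance = SERIES_RESISTANCE_OHMS + subject_resistance_ohms
    return SOURCE_VOLTS / total_resistance * 1000.0


nominal = current_ma()
print(f"nominal current: {nominal:.2f} mA")

# Hypothetical subject resistances, for illustration only.
for resistance in (50e3, 100e3, 200e3):
    i = current_ma(resistance)
    drop = (nominal - i) / nominal
    print(f"{resistance / 1e3:.0f} kilohm subject: {i:.3f} mA ({drop:.1%} below nominal)")
```

Because any plausible subject resistance is small relative to 6 megohms, the current changes by only a few percent across the hypothetical range, which is in keeping with the stability reported above.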

In general, the voltage and current levels used
in these studies were considerably higher than
those in most other investigations in this area.
This provides the occasion for a methodological
note, harkening back to some points raised earlier.
While there has been a great deal of research
upon the properties of various shock circuits in
relation to behavior, there are only the beginnings
of a systematic behavioral picture of what happens
at given shock levels for naive animals, let alone
sophisticated ones. Consequently, the voltage and
current levels are the only dependable index we
have of the severity of trauma. This makes it
essential to consider carefully the properties of
different circuits when comparing investigations.
What may be described as a "high" or "strong"
shock in one study may be "weak" or "low" relative
to another study. And, of course, until a
systematic body of data exists for the effects of
a graded series of shock strengths, how can one
compare or integrate the findings of different
studies?


TABLE 1
ALLOCATIONS OF SUBJECTS PER EXPERIMENT

Experiment   Total N   Number of    Subgroup N
                       subgroups    (1/2 male, 1/2 female)
I               96        12            8
II              40         5            8
III             80         8           10
IV              24         3            8a
V               60         6           10
VI              30         3           10
Total          330

a Only male subjects.


Shock Box. This, the main traumatization de- 
vice, was a windowless black box, 9" wide, 18" 
long, and 11” high with a grid floor having i" 
bars, 2” apart. It had a hinged top by means of 
which Ss were introduced and removed; there was 
no light inside the box. It was used in all experi- 
ments, I-VI. 


Harness. This was a nonconducting masonite 
board upon which animals were spreadeagled and 
strapped, ventral surface down, with electrodes 
attached to the front paws so that current neces- 
sarily passed through the body. The device per- 
mitted no up-and-down movement and only a 
minimal kind of motion along the body and limb 
axes, a “hunching-up,” so to speak. The device, 
with the animal attached, was placed on the top 
of a laboratory table and then wired into the same 
current source as the shock box. This was used 
in Experiments III and V. 


Grid Runway. Constructed of masonite with 
grid floors, the device was 5' long, 6" wide, and 
8" high throughout. It was divided into three 
portions: a dark brown start box (12" x 6"), 
a runway (36" X 6"), and a white goal box 
(12" x 6"). The start box had a grid floor that 
was continuous with that of the runway while the 
floor of the goal box was masonite. Between each 
section was a guillotine-type door controlled by 
overhead cords. The entire apparatus was housed 
in a plain room with a single fluorescent light fix- 
ture high overhead oriented to the long axis of 
the device. This was used in Experiments I, II, 
and III. 


Closed Field with Grid. Constructed of mason- 
ite, the device was 30" x 30" x 6". The walls 
were painted black and above the floor, which was 
white and divided by black lines into 36 5" 
squares, was a grid for administering shock. The
apparatus was raised 3' from the ground and 
placed in a room with permanently installed black- 
out curtains on the windows. A single fluorescent 
ceiling fixture provided a constant source of illu- 
mination. This was used in Experiment V. 

Cold Water Tank. A small circular tank, 12" 
in diameter and 24" deep, with water maintained 
at 37°F. was used in Experiment V. 


Test 


Four devices were used for testing: grid run- 
way—described in preceding section; closed field 
with grid—same as in preceding section; closed 
field without grid—same as shock box but without 
grid; and elevated runway—a straight runway, 
9' long, 2" wide, painted black and raised 3' above 
the floor. 


Scores 


There were several sorts of measures used in 
the various studies. We shall describe here only 
those for the grid runway test which was used in 
the first three studies. The other measures will 
be described in the experiments in which they 
were used. 


Running Time 


Five seconds after Ss were placed in the start
box, the guillotine door separating the start box 
from the runway was raised, providing a com- 
bined visual and auditory CS for the rat. Raising 
the door initiated a 2-second "delay" period after 
which the UCS (shock) was switched on by 
means of an automatic interval timer. Running 
time was the interval between raising the door and 
entrance into the goal box; the goal box floor had 
a pressure-plate switch which stopped the timing 
circuit started by raising the guillotine door in the 
start box. 


Nontime Measures (Achievement) 


Escape Responses. Those responses which oc- 
curred after shock onset. 


Incomplete Avoidance Responses. Those re- 
sponses where movement into the runway out of 
the start box occurred before shock onset but 
which did not bring the S to the goal box before
the grid was charged. In this instance, anticipatory 
behavior occurred but S failed to act quickly 
enough and therefore received primary (shock) 
reinforcement. 

Avoidance Responses. Those responses which 
permitted S to reach the goal box before shock
onset. 

It should be noticed that membership in one 
achievement category excludes membership
in either of the other two.


Procedures 


Only those procedures which were followed in 
all of the studies will be described. 

Weaning. All Ss were weaned at 20 days of 
age. From that date on, they were maintained in 
front-opening individual cages, mounted in battery
racks.* Water bottles and bulk feeders were used. 


Handling. With the exception of specific experi- 
mental procedures requiring contact by manipu- 
lation, Ss remained in their individual cages 
throughout the experiment. 


Weanling Trauma. Ss were shocked from 21-25 
days of age for 2 continuous minutes per day 
except where otherwise indicated. Experiment II 
should be consulted for leads as to the parametric 
implications of this schedule of traumatization. 


Escape Learning. Escape learning consisted of 
those trials early in the learning series before Ss 
anticipated shock onset. Five seconds after ani- 
mals were placed in the start box, the guillotine 
door leading to the runway was raised and after 
a 2-second delay the grid was charged. This procedure
was not followed in Experiment I; instead
of a 2-second delay between raising the door and 
shock onset, the two occurred simultaneously. 
There were five learning trials per day for 10 
successive days; trials were 3 minutes apart and 
Ss remained in the goal box for 5 seconds before
being removed.


Avoidance Learning. Avoidance learning con- 
sisted of the same procedure described above for 
escape. A running score of less than 2 seconds 
meant complete avoidance while a score of more 
than 2 but less than 4 seconds meant incomplete 
avoidance. In Experiment I, where the escape 
procedure was different from all other studies, the 
avoidance procedure was the same, with the usual 
2-second interval. There were five learning trials 
per day for 10 successive days; trials were 3 min- 
utes apart. Ss remained in the goal box for 5 
seconds before being removed. 
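
The scoring rule for the 2-second-delay procedure can be written as a small classifier, which also makes the mutual exclusivity of the three achievement categories explicit. This is a sketch under stated assumptions: the text does not say how a score of exactly 2 or exactly 4 seconds was treated, so the boundary handling below is our own choice, as is the treatment of every score of 4 seconds or more as an escape response. The sample running times are hypothetical.

```python
from collections import Counter

SHOCK_DELAY_S = 2.0       # door-to-shock delay stated in the text
INCOMPLETE_LIMIT_S = 4.0  # upper bound for incomplete avoidance stated in the text


def classify(running_time_s):
    """Assign one running time (seconds from door raising) to a category.

    Assumption: a score of exactly 2.0 s counts as incomplete avoidance and
    a score of exactly 4.0 s counts as escape; the source leaves both open.
    """
    if running_time_s < SHOCK_DELAY_S:
        return "avoidance"
    if running_time_s < INCOMPLETE_LIMIT_S:
        return "incomplete avoidance"
    return "escape"


# Hypothetical running times for one S on one day (five trials).
sample_times = [6.3, 4.8, 3.1, 1.7, 1.5]
tally = Counter(classify(t) for t in sample_times)
print(tally)
```

Each trial falls into exactly one category, so the three counts for an S always sum to the number of trials run.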


THE EXPERIMENTS


This section presents specialized pro- 
cedures, results, and a brief discussion for 
each of the experiments. The concluding 
discussion takes up the implications of the 
experiments as a group. 


Experiment I: Age and Retention Intervals
as Factors Influencing the Effect of Trauma 


Probably the most important develop- 
mental hypothesis of recent years proposes 
that there are critical periods in the life of 
the organism; it is only at such times that 
certain experiences may have any effect 
at all or may have a maximal effect. Work
on imprinting (Thorpe, 1956) and the investigations
of Harlow (1958) are very
suggestive here. Is this also the case for
traumatic experiences? An earlier attempt
(Baron et al., 1957) to answer this question
yielded negative results, viz., weanling
and adult traumatized Ss did not differ from
one another on adult tests though both
differed from controls. In that study, age at
trauma and interval between trauma and test
were confounded, leaving open the possibility
that weanling effects are greater than
adult effects but dissipate with age if not sustained
by other experiences. In this experiment,
therefore, the effects of age at trauma
and interval between trauma and test are
separated out.


* Cages were No. 409 and racks were No. 410
in the lines manufactured by Bussey Products
Company, Chicago, Illinois.


Procedure 


There were 96 Ss (48 males, 48 females), all of 
whom were weaned at 20 days and placed on an 
ad lib. feeding schedule in individual cages. The 
shock box and grid runway were used. Table 2
outlines the experimental schedule.


Treatment. Each of the 12 groups was matched 
with respect to weight, sex, and litter and was 
formed at the time of weaning. Experimental Ss 
received 2 minutes of continuous shock on each 
of 5 successive days, while control Ss were neither 
shocked nor handled. As may be seen from Table
2 each experimental trauma group had a separate 
control. 

Since the events under trauma are so uniform
(except in some of the conditions in Experiment
III), we present here a general account of what
happens when shock is applied.

A 1.25-milliampere current induces frantic running
accompanied by urination and defecation. If
shock duration is long then running yields to tonic 
immobility; this immobility is not the result of 
fatigue, apparently, since upon shock termination 
Ss immediately revert to active state. 

There is a great deal of leaping, as well as 
running, under shock; usually this concentrates 
upon the walls of the box rather than being un- 
differentiated leaping. By the third or fourth day, 
Ss begin to jump immediately on being placed in 
the box, that is, before shock is applied. 

The paws of some Ss occasionally bleed during 
trauma; either the frantic running itself or the 
electrical arcing from the grid could produce this. 
Since adult animals do not show these injuries it 
is clearly the more delicate tissue of the pups
which makes some of them susceptible to damage.

No permanent defects or deaths which could be
attributed to the shock were observed. The hypothesis
that a selection process is in operation
(whereby only the hardier animals survived the
trauma to be tested as adults) is certainly not
tenable.


K. H. BROOKSHIRE, R. A. LITTMAN, AND C. N. STEWART


TABLE 2
OUTLINE OF EXPERIMENTAL SCHEDULE FOR EXPERIMENT I

Group     N              Age at time of        Age at time     Testing
          (Total = 96)   treatment (trauma)    of testing      situation
E I₁      8              20-24 days            125-132 days    Escape
E I₂      8              20-24 days            125-132 days    Avoidance
C I₁      8              (No shock)            125-132 days    Escape
C I₂      8              (No shock)            125-132 days    Avoidance
E II₁     8              120-124 days          125-132 days    Escape
E II₂     8              120-124 days          125-132 days    Avoidance
C II₁     8              (No shock)            125-132 days    Escape
C II₂     8              (No shock)            125-132 days    Avoidance
E III₁    8              120-124 days          225-232 days    Escape
E III₂    8              120-124 days          225-232 days    Avoidance
C III₁    8              (No shock)            225-232 days    Escape
C III₂    8              (No shock)            225-232 days    Avoidance

Finally, it is apparent that taming occurs during 
the infantile treatment phase of the experiment. 
Not only do the handled groups adapt to the 
experimenter (E) during the course of treatment, 
but the shock groups also do. It should be empha- 
sized, however, that with regard to the shocked 
animals this “tameness” is evident only before the 
daily treatment. Immediately after treatment Ss 
are difficult to handle, making explicit escape re- 
sponses on being picked up by E. 

Test. Ss were given the escape and avoidance 
training described in the section on General De- 
sign. It should be noted that this experiment is 
the one in which escape training involved simul- 
taneous shock and door raising, in contrast to the 
remaining experiments. There were five training 
trials per day for 8 successive days; trials were 
spaced 3 minutes apart. 


Results 


The results for escape and avoidance 
learning are presented in Table 3. 


Escape Learning in “Escape Situation.” 
Escape behavior was analyzed for the early 
trials, 1-5, and later trials, 6-40. An 
analysis of variance for Trials 1-5 with ex- 
perience (shock or no-shock), trauma-test 
interval, and sex as primary sources reveals 
trauma-test interval and the trauma-test interval
× experience interaction to be significant
at the 5% level. By Tukey’s gap test
it would appear that the group contributing
most to the interaction variance is Group
E II₁ (see Tables 2 and 3), that is, traumatized
on Days 120-124 and tested immediately
afterward beginning Day 125 until Day
134. This group acquires the escape running 
habit under shock more quickly than either 
the control groups or Ss exposed to shock 
100 days prior to escape training (regard- 
less of age at time of trauma). 

The analysis of variance for Trials 6-40 
reveals only one source significant at the 5% 
level, experience. Inspection of Table 3 
shows that traumatized Ss in all three
groups have longer running times than 
their nonshocked controls. 


Escape Reactions in the “Avoidance Situation.”
The early trials of avoidance learning
are very much like escape learning conditions
since anticipatory responses have to
be developed. This may be seen from Figure
1 which is a plot of the three types of responses
in the avoidance situation. It is not
until the fifth day of testing that avoidance
frequencies approach escape-type behavior.
For the first day (Trials 1-5) there are
never any complete avoidances though occasional
incomplete avoidances appear; these
data are based on the Ss in Experiment III.
Hence, we have analyzed the avoidance trials
in which the response did not occur until
the grid was charged, so that running was
to escape shock. It can be seen from Table
3 that the escape behaviors under avoidance
operations have the same pattern that they
do under escape operations, though Trauma
Groups I and III do somewhat better
(probably revealing some influence from the
growth of anticipatory behavior). In any
event, an analysis of variance reveals sex
and the trauma-test interval interaction to
be significant sources of variance. When
the Tukey gap test is applied, it is seen that
the E II₂ (immediate test) group escapes
more rapidly than any other experimental
or control group, replicating the findings for
the “pure” escape situation.

Number of Incomplete Avoidances. In 
Table 3 it may be seen that the experimental
Ss consistently have more incomplete
avoidances than the controls, that is, running 
before shock begins but failing to reach goal 
box before shock onset. However, an 
analysis of variance did not yield any 
significant sources of variance, intragroup 
variability being very great. 

Number of Complete Avoidances. Table 
3 shows how the groups compared on com- 
plete avoidances. The superiority of experi- 
mental animals over controls (paralleling 
those in the incomplete avoidance measure) 
stands up here; an analysis of variance 
shows both experience (traumatized or non- 
traumatized) and sex to be significant 
sources. Note that here there is no interac- 
tion, such as found for escape responses, 
between trauma and test interval. 

TABLE 3
RESULTS FOR ESCAPE AND AVOIDANCE IN EXPERIMENT I

                                                  Trauma-test interval group
Situation                   Subjects   Trials     I           II           III
                                                  (20-125)    (120-125)    (120-225)
Escape
  Mean running time            E        1-5       5.47        2.27*        4.85
  (seconds)                    C        1-5       [illegible]
                               E        6-40      [illegible]
                               C        6-40      1.49        1.50         1.55
Avoidance
  Mean running time in         E        1-5       5.74        4.37*        5.60
  seconds for escape-type      C        1-5       5.64        5.82         5.55
  responses (includes
  2-second delay)
  Mean number of incomplete    E        1-40      7.62        7.78         8.50
  avoidances                   C        1-40      [illegible]
  Mean number of avoidances    E        1-40      [illegible]
                               C        1-40      [illegible]

Subjects: E = experimental, C = control.
* Differs at 1% level by Tukey gap test from other E and C groups.


Age at Traumatization. Determining whether
the age at which traumatization occurred is
related to residual effects is one of the
major objectives of this experiment. The
answer is clear-cut: there is no effect
attributable to age at traumatization. This
may be seen by comparing the behavior of
animals traumatized at 20-24 days (Group
I) and 120-124 days (Group III), for
whom the interval between trauma and test
is equal though the age at time of test is
not. As will be seen from Table 3, the
running times and relative frequency of
avoidance behaviors are of the same order
of magnitude for both groups. It should
be noted that the experimental Ss do, of
course, differ from the controls.

Age at Time of Testing. If we inquire 
whether the age at test is related to trauma- 
tization experiences, the answer is also 
negative. It will be seen from Table 3 that 
whether animals are tested at 125 days of 
age or 225 days they behave substantially 
the same. This particular generalization is 
clouded by the superior escape behavior 
of Group II (the "immediate" group) in the 
first five trials; the difference seems to 
depend upon the interaction of test age X 
trauma-test interval according to the analysis 
of variance referred to above. Since the 
interaction dissipates when the full range of 
40 trials is considered it seems safe to con- 
clude that the test age is an irrelevant vari- 
able. 

Interval between Trauma and Test. Here 
we find the only instance in which the ex- 
perimental groups are split off from one 
another relative to the control groups. 
Group II, the immediate group, shows bet- 
ter escape behavior than the other experi- 
mental animals but also is superior to the 
control Ss. This point should be kept in 
mind because it is the only instance in this 
entire series that traumatized animals show 
superior escape behavior. The effect of a
brief trauma-test interval does not appear
in connection with avoidance behaviors
where, as will be seen in all subsequent experiments,
experimentals do better than controls.
Therefore, our data suggest that long
intervals between trauma and test do not 
alter the influence of trauma residuals, re- 
gardless of the age at which traumatization 
or testing occurs. Short intervals, on the 
other hand, seem to make the residual advantageous
for escape behavior, though only
for a brief period, since the mean escape
behavior for Trials 6-40 shows Group II 
behaving at the same level as the other two 
experimental groups (Table 3). 


Sex Differences. We do not present any 
tables in which sex differences are analyzed.
However, females behaved more poorly than 
males under both escape and avoidance con- 
ditions. This was true for both control and 
experimental Ss, and since there is no interaction
between sex and any of the other
variables it probably reflects nothing more 
than the general slowness of running in 
female rats relative to males. 


Conclusions 


1. The effects of trauma upon weanlings
are the same as its effect upon adult animals; 
escape learning is hampered and avoidance 
learning is aided. If there is a "critical" 
period, it is somewhere before Day 20 or 
between Days 35-120. 

2. Trauma residuals operated upon 125- 
and 225-day-old animals to the same degree. 

3. An interval of 100 days between 
trauma and test has the same influence 
whether trauma occurs in weanlings or 
adults. 

4. There is a transitory benefit for escape 
if testing occurs a short time after trauma. 

5. The effect of traumatization, that is, 
the influence of residuals, may be advan- 
tageous or disadvantageous depending upon 
the task confronting S. There is
an interaction between residuals and behavior
setting so that there can be no
maxims to the effect that “Trauma is harmful
(independent of test circumstances)” or
that “Trauma is beneficial (independent of
test circumstances).”


Experiment II: Role of Frequency and 
Duration of Shock in Establishing Residua 


Once it was clearly established that a
residual of shock-trauma did indeed exist,
the next logical step was to explore the conditions
under which the trauma was administered.
We first turned our attention to
the frequency and duration of the shock,
leaving variations in shock intensity for a
future series of replications. The need for
this sort of parametric investigation is 
obvious, for it is perfectly possible that we 
had just happened to hit upon a lucky com- 
bination of duration and frequency of 
trauma; if that were so, then it would be 
necessary to alter the strategy of our think- 
ing considerably, shifting to a study of just 
why only a narrow range of values was 
successful. To anticipate the results, it 
turned out that there is a wide range within 
which residua may be established. 

The hypothesis underlying Experiment II 
was that the strength of the residual is 
linearly related to the frequency and dura- 
tion of trauma. As will be seen, however, 
the hypothesis was only imperfectly tested 
by the design used; for duration there were 
three points but for frequency only two 
points. 


Procedure 


There were 40 Ss (20 males, 20 females) bred 
in the University of Oregon laboratories. The
shock box and grid runway were used. Table 4 
outlines the experimental schedule for the five 
groups of eight Ss each. 

Weanling Treatment. Ss were kept in individual 
cages starting with Day 20 (weaning) and there- 
after maintained on ad lib. feeding and watering. 
After the weanling trauma period the animals 
were not handled until they reached 100 days of 
age, at which time they entered upon the adult 
testing phase of the experiment. 

Adult Testing. Testing consisted of five train- 
ing trials per day for 10 consecutive days on the 
grid runway. Escape testing consisted of the 
escape-type behaviors which occurred during the 
first five trials of avoidance training operations. 
No control groups were used though comparisons
with the behavior of control animals in other
experiments in the series will be offered.


TABLE 4
OUTLINE OF SCHEDULE FOR EXPERIMENT II

Group    Exposure per day    Number     Age of subjects
         (in minutes)        of days    (in days)
A              1                5          21-25
B              2                5          21-25
C              4                5          21-25
D              2               10          21-30
E              4               10          21-30
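
The schedule in Table 4 can be summarized by the cumulative shock exposure each group received. The sketch below simply multiplies daily duration by number of days; the group letters and values are transcribed from Table 4, and the variable names are our own.

```python
# Group -> (minutes of shock per day, number of days), from Table 4.
SCHEDULE = {
    "A": (1, 5),
    "B": (2, 5),
    "C": (4, 5),
    "D": (2, 10),
    "E": (4, 10),
}

# Cumulative exposure in minutes for each group.
total_minutes = {
    group: minutes * days for group, (minutes, days) in SCHEDULE.items()
}
print(total_minutes)  # {'A': 5, 'B': 10, 'C': 20, 'D': 20, 'E': 40}

# Cells of the duration x frequency design that were not run.
durations = sorted({m for m, _ in SCHEDULE.values()})
frequencies = sorted({d for _, d in SCHEDULE.values()})
missing = [
    (m, d) for m in durations for d in frequencies
    if (m, d) not in SCHEDULE.values()
]
print(missing)  # [(1, 10)]
```

Two points follow from the tabulation. Groups C and D received the same cumulative exposure (20 minutes) distributed differently, and the design has no group given 1 minute per day for 10 days, which is one reason the linearity hypothesis could be tested only imperfectly.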


Results 


The results for escape and avoidance are 
given in Table 5. 


Escape. An analysis of variance indicates 
that both duration of exposure per day and 
number of days are significant sources of 
variance (p < .05). Tukey's gap test suggests
that the Ss with 1 minute of shock
are superior in escape behavior to all others, 
but that whether animals receive 2 or 4 
minutes of shock per day does not matter. 
It is also evident that animals shocked for 
10 days do more poorly than those shocked 
for only 5 days, that is, have longer running 
times. 

Comparison of these results with those 
for the control animals in Experiment I 
(see Table 3, Avoidance, escape-type 
responses, Control group scores) makes it 
clear that the differences between 5- and 
10-day groups occur at a level which is 
lower than that for control groups. In other 
words, a residual effect is present so that 
both 5- and 10-day groups escape more 
slowly than control Ss. Comparisons may 
also be made with control data for the ex- 
periments reported below; they, too, will 
show that a residual effect is present in the 
behavior of these traumatized Ss. 


Incomplete Avoidance. The analysis of 
variance does not yield any significant 
sources of variance for the data in Table 5. 


Complete Avoidance. The analysis of 
variance indicates that there are no signifi- 
cant differences among the three groups 
whose data are given in Table 5. 


Conclusions 


Once again, it was demonstrated that 
there is a residual effect of weanling trauma 
upon adult escape and avoidance learning. 
In addition to differing among themselves, 
the Ss in this experiment do more poorly 
on escape than control animals in other 
studies. 


TABLE 5
EFFECT UPON ADULT ESCAPE AND AVOIDANCE BEHAVIOR OF DIFFERENT DURATIONS
AND FREQUENCIES OF WEANLING TRAUMA
(Experiment II)

                                  Days of    Daily duration of shock trauma (minutes)
Test situation                    shock        1        2        4      Mean
Escape
  Mean running time in seconds       5       5.73*    7.35     7.20     6.76
  (Trials 1-5)                      10                9.38     7.97     8.68
                                     M       5.73     8.36     7.59
Avoidance
  Mean number incomplete             5       9.71     7.50     9.25     8.82
  (Trials 1-50)                     10                7.78    10.38     9.08
                                     M       9.71     7.64     9.82
  Mean number complete               5      25.00    22.50    23.50    23.67
  (Trials 1-50)                     10               22.00    20.00    21.00
                                     M      25.00    22.25    21.75

* Differs at 5% level by Tukey's gap test from other four groups.


Duration. One minute of shock per day
for 5 days does not produce a residual that
appears in escape behavior. Whether it
would if prolonged for 10 days we cannot 
now say. It is clear, however, that increasing 
the exposure does produce a residual, though 
there is no difference between 2 minutes as 
compared with 4 minutes. This differential 
effect occurs only for escape behavior. As 
far as avoidance behavior is concerned, 
merely being shocked seems to be the rele- 
vant factor; this can be seen by comparing 
the results for these Ss with the behavior 
of controls in Experiment I or any of the 
experiments below. 


Frequency. Ten days of shock produce 
slower escape times than 5 days. Avoidance 
behavior is not altered by increasing the 
number of days on which shock is given, 
though the absolute scores are in the direc- 
tion of lowered avoidances with increased 
number of days. 
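
The duration and frequency comparisons rest on the marginal means of Table 5. As a rough check, the following sketch recomputes the escape marginals as unweighted means of the cell means. Treating the published marginals as unweighted averages is an assumption, as are the dictionary layout and function name; the 1-minute, 10-day cell is absent because no such group was run.

```python
# Cell means for escape running time (seconds, Trials 1-5) from Table 5.
# Keys are (days of shock, minutes of shock per day). The layout and the
# names used here are illustrative, not part of the original report.
escape = {
    (5, 1): 5.73, (5, 2): 7.35, (5, 4): 7.20,
    (10, 2): 9.38, (10, 4): 7.97,  # no 1-minute, 10-day group was run
}


def marginal_mean(cells, days=None, minutes=None):
    """Unweighted mean of the cell means at the requested factor level."""
    values = [
        value
        for (d, m), value in cells.items()
        if (days is None or d == days) and (minutes is None or m == minutes)
    ]
    return sum(values) / len(values)


by_days = {d: marginal_mean(escape, days=d) for d in (5, 10)}
by_minutes = {m: marginal_mean(escape, minutes=m) for m in (1, 2, 4)}

for d, value in by_days.items():
    print(f"{d:>2} days of shock: {value:.3f} s")
for m, value in by_minutes.items():
    print(f"{m} min per day: {value:.3f} s")
```

Under this assumption the recomputed values agree with the Mean column and M row of the table to within rounding, which is the pattern the Duration and Frequency conclusions describe: the 1-minute group is fastest, and 10 days of shock give slower escape than 5 days.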


Linearity Hypothesis. The data do not 
support such an hypothesis. A more reason- 
able hypothesis is that there is a quantum or 
threshold phenomenon: Once a particular 
duration and frequency of shock is reached, 
there are no variations in the relation between
escape and avoidance behavior as a
function of increases in frequency or dura- 
tion of trauma. It should be kept in mind 
that these results are for a single, very 
high, shock level. It will be interesting to 
see what the results are for a replication of 
this design using lower shock levels. 

Differential Impact on Escape and Avoidance.
As in all the experiments in the series,
whenever trauma has occurred, escape behavior
is less, and avoidance behavior is
more, efficient. However, in this investiga- 
tion, it is also true that escape behavior 
shows an influence from differing durations 
and frequencies of shock while avoidance 
behavior does not. The two patterns of be- 
havior are apparently under different “con- 
trols.” 


Experiment III. Behavior Possibilities at 
the Time of Trauma 


While the effect of shock trauma upon 
animals might be so massive as to flavor there- 
after large areas of the animal's behavior, it 
need not be so. It is possible that there are
fairly uniform patterns of responding to
shock which become involved in some kind
of emotional or instrumental learning even 
at the weanling stage and which are sub- 
sequently responsible for the apparent 
effects of shock. With this possibility in 
mind—it might be dubbed the learning view- 
point—there was an attempt to control di- 
rectly and indirectly some of the things that 
weanlings might do when they were shocked; 
the behavior of the Ss under the standard 
adult test conditions was then analyzed in 
relation to the early behavior possibilities 
made available to them by our experimental 
procedures. In all, there were eight different 
conditions to which Ss were exposed, only 
one condition, of course, to an S. 


Procedure 


There were 80 Ss (40 males, 40 females) from 
the Oregon colony. All animals were weaned at 
20 days of age and placed on an ad lib. feeding 
schedule in individual cages. The shock box, grid 
runway, and harness were used. There were 8 
groups of animals. As usual there were two
phases, a weanling treatment and adult test. 


Weanling Treatment. From Day 21 to Day 25 
each S was subjected to one of the following 
treatments. 


1. Group E1 (complete escape): two subgroups.


a. E1a. Ss were trained to escape shock by
running across 3' of charged grid to the un- 
charged goal box. The S was returned to its 
living cage after the correct response was made; 
there was a 3-minute intertrial interval. Total 
amount of exposure to shock was controlled by 
subjecting each S to that number of daily trials 
which produced a cumulative total of 2 minutes 
on the shock grid. 


b. E1m (maturational control). Because Ss
were young there was reason to expect a strong 
maturational influence upon speed of running. 
Therefore, a separate group of six animals, when 
they were 25 days old, was given 1 day of escape 
training exactly as done with the main group. A 
comparison of the performance for the first day 
of learning of the 25-day-old animals with that 
for the first day of the 21-day-old animals would 
provide an index of the advantage attributable to 
maturational influences. 

2. Group E2 (start box: no escape). Ss were
exposed to electric shock only in the start box of
the grid runway. No escape was possible. Each
animal was matched with a littermate in Group E1
whose length of exposure to shock per trial was
duplicated. As in Group E1, at the termination of
each trial Ss were returned to their living cages, 
and trials were spaced 3 minutes apart. 


3. Group E3 (start box: intermittent shock). Ss
in this group were also exposed to electric shock
in the start box of the grid runway. Once again,
each animal was matched with a littermate in
Group E1 whose length of exposure to shock per
trial was duplicated. However, instead of being
returned to its living cage following each trial,
the E3 S was required to remain in the start box
for 20 seconds, following which the next trial 
began. 

4. Group E4 (start box: continuous shock). Ss
received 2 minutes of continuous shock per day in 
the start box of the grid runway. Each S, there- 
fore, received only one "trial" per day but a total 
duration of shock equal to that of the preceding 
groups. 

5. Group C1 (handled). Each S was removed
daily from its home cage and placed for 2 minutes 
in the start box of the runway but was not ex- 
posed to shock. 

6. Group E5 (shock box: continuous shock).
Ss were exposed to 2 minutes of continuous shock 
per day in a black box which contained no win- 
dows or any other light source. 

7. Group E6 (harness: continuous shock). Ss
received 2 minutes of continuous shock per day
in a harness arrangement permitting little, if any,
movement of the skeletal musculature. All four
legs, as well as the abdomen, were secured to a
nonconducting masonite board. Electrodes were 
attached to the forepaws of the S, so that the 
current necessarily traveled through the abdomen. 

8. Group C2 (ignored). Ss were not removed
from their cages at any time after weaning on 
Day 20 until testing was initiated 80 days later. 
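
The eight weanling conditions listed above differ along a few dimensions: whether shock was given, where treatment took place, and whether S could end the shock by its own response. The sketch below encodes them as a small table so that such contrasts can be read off directly. The field names and the wording of each note are illustrative choices, not the authors' notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    code: str         # group label as used in the text
    shocked: bool     # was shock given during Days 21-25?
    setting: str      # where the weanling treatment took place
    can_escape: bool  # could S terminate shock by its own response?
    note: str


CONDITIONS = [
    Condition("E1", True, "grid runway", True,
              "escape training; trials totalling 2 min of shock per day"),
    Condition("E2", True, "start box", False,
              "shock per trial matched to an E1 littermate; caged between trials"),
    Condition("E3", True, "start box", False,
              "shock per trial matched to an E1 littermate; 20 s in box between trials"),
    Condition("E4", True, "start box", False, "2 min continuous shock per day"),
    Condition("C1", False, "start box", False, "handled; 2 min in box, no shock"),
    Condition("E5", True, "black shock box", False, "2 min continuous shock per day"),
    Condition("E6", True, "harness", False,
              "2 min continuous shock per day; movement restrained"),
    Condition("C2", False, "living cage", False, "ignored until adult testing"),
]

# Shocked groups that had no way of ending the shock themselves.
no_escape = [c.code for c in CONDITIONS if c.shocked and not c.can_escape]
print(no_escape)
```

Laid out this way, E1 is the only shocked group with an instrumental escape response available, and E6 is the only one in which movement itself was prevented.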

Adult Testing in Grid Runway. Avoidance
training for Ss in each of the eight groups com- 
menced at 100 days of age. Ss were given five 
training trials per day for 10 successive days. 
Trials were spaced 3 minutes apart. 


Results 


There are two sets of results, those for 
the learning of the E1 group weanlings and
for all animals during the adult tests. 


E1 Weanling Learning. The main purpose
of this group was to see whether early
direct training on the task used for adult 
testing would have positive transfer value. 
Consequently, it is essential to know whether 
any learning occurs during the weanling 
period because, after all, the Ss are very 
young. 

The results are quite definite; weanling 
animals learn to run faster under escape 
Fig. 2. Mean running times of escape group
(E1a) and its control (E1m). (Experiment III)


conditions. Figure 2 shows the changes in 
running time over the 5 days of testing for 
the Ss of E1a and E1m. The performance of
E1a for Day 25 is significantly superior to
its performance on Day 21 by the Mann-Whitney
U test. However, the performance
of the maturation control group, E1m, on its
first trial, which took place on Day 25, is also
superior to the first day performance of the
E1a group, indicating that there is a clear-cut
maturational effect at work; in this case,
undoubtedly, a simple increase in ability to
run swiftly. At the same time, the E1a group
is superior on its last day of training to the
first day of the E1m group, so it is safe to
conclude that while maturational factors account
for a tremendous portion of the increase
in speed for Day 25 over Day 21, they
do not account for all of it; percentage-wise,
the drop from 33 seconds on Day 1
to about 5 seconds on Day 5 is about 79% 
maturational and about 21% the effects of 
the learning trials. 
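
The percentage split can be sketched as follows, on the assumption that the maturational share is the part of the total drop already shown by untrained 25-day-old animals on their first day. The 33-second and 5-second values are those quoted above; the first-day time of the maturational control group is not given in the text, so the 10.9 seconds used here is a hypothetical value back-computed from the reported 79%.

```python
def partition_speed_gain(trained_first, trained_last, control_first):
    """Split the drop in running time into maturation and learning shares.

    trained_first: first-day time of the trained group (Day 21)
    trained_last:  last-day time of the trained group (Day 25)
    control_first: first-day time of untrained animals of the same final age
    """
    total_drop = trained_first - trained_last
    maturation = trained_first - control_first  # gain with age alone
    learning = control_first - trained_last     # further gain from training
    return maturation / total_drop, learning / total_drop


# 33 s and 5 s come from the text; 10.9 s is a hypothetical control value
# chosen to be consistent with the reported 79% / 21% split.
maturation_share, learning_share = partition_speed_gain(33.0, 5.0, 10.9)
print(f"maturation: {maturation_share:.0%}, learning: {learning_share:.0%}")
# prints: maturation: 79%, learning: 21%
```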


We can be quite sure, therefore, that
the weanling animals of E1a learned an
escape running habit. 

Adult Testing. The results are presented 
separately for escape and avoidance. 


1. Escape: Table 6 gives the group mean 
running times for escape responses on 
Trials 1-5 and 6-50; the data are ordered
by running time and divided in accordance 
with the results of the Tukey “layering” test 
for Trials 1-5 (Ryan, 1959). It should be
noted that the escape responses for Trials 
6-50 are based on markedly different num- 
bers of trials since such responses are inter- 
spersed among increasingly frequent avoid- 
ance-type responses (cf. Figure 1). How- 
ever, a previous study has shown that no 
relationship exists between escape times and 
the number of trials on which they are 
based after asymptotic levels are reached. 

The analysis of variance for Trials 1-5 
reveals infantile experience is a significant 
source of variance. The analysis of variance 
for Trials 6-50 does not show experience 
to be a significant source. In both analyses
of variance sex was a significant source, with


TABLE 6


ESCAPE RUNNING TIME DURING ADULT TEST
DIVIDED INTO TWO CATEGORIES, FOR TRIALS
1-5 ONLY, BASED ON TUKEY'S
"LAYERING" TEST
(Experiment III)


                                                          Trials
Category        Groups (weanling experience)            1-5    6-50

High running    E1 (escape training)                    4.05   3.28
speed           C1 (handled)                            4.81   3.37
                C2 (ignored)                            4.83   3.27
Low running     E5 (shock box: continuous shock)        6.40   3.77
speed           E2 (start box: no escape)               6.42   3.72
                E6 (harness: continuous shock)          6.62   3.73
                E3 (start box: intermittent shock)      6.63   4.09
                E4 (start box: continuous shock)        7.18   4.09
TABLE 7


COMPLETE AVOIDANCE BEHAVIOR DURING ADULT TEST, DIVIDED INTO TWO CATEGORIES (MEAN
FREQUENCY OF AVOIDANCE ONLY) ACCORDING TO TUKEY'S "LAYERING" TEST


(Experiment III)


                                                          Mean         Mean trial
Category           Groups (weanling experience)           avoidance    until first
                                                          frequency    avoidance

Most avoidances    E1 (escape)                            21.6         1152
                   E3 (start box: intermittent shock)     24.2         16.3
                   E5 (shock box: continuous shock)       23.0         17.3
                   E2 (start box: no escape)              22.7         16.5
                   E4 (start box: continuous shock)       22.5         20.2
Least avoidances   C2 (ignored)                           18.6         19.6
                   C1 (handled)                           18.5         19.6
                   E6 (harness: continuous shock)         18.3         19.8


males being faster than females; such a 
difference does not exist in the one weanling 
group for which we have behavioral data, 
E1.

As the configuration resulting from the 
Tukey test indicates, animals who received 
direct escape training in infancy were, unlike 
other traumatized animals, able to escape as 
efficiently as control animals. The other 
traumatized animals had slow escape times. 
It is interesting to note that the results for 
Trials 6-50 follow the same ordering, 
though, because the analysis of variance did 
not show a significant F, the Tukey test has 
not been applied. 

2. Avoidance: Since Incomplete Avoid- 
ance (IA) scores showed no differences 
among groups for two measures—"mean 
IAs per group" and “mean trials until IA" 
—we do not present the IA data. 

These same measures were applied to 
Complete (or successful) Avoidances and 
the results are shown in Table 7, which has
been organized in the same manner as Table 
6 for escape responses. Significant differ- 
ences for the early experience and sex 
sources are found for mean number of
avoidances only; while the mean trials until 
first avoidance are in the same direction, 
they do not yield a significant F. The con- 
figuration resulting from Tukey’s layering 
test is different from that for escape behavior
and we draw special attention to
these differences because they are the basis
for some of our conclusions concerning the 
nature of shock trauma. All traumatized Ss
do better than controls except for the
harness group (E6); the control groups are
joined by the animals restrained when 
traumatized in giving poor avoidance scores. 


Conclusions 


1. As in the previous two studies, trauma 
generally hinders adult escape learning and 
facilitates avoidance learning. 

2. However, direct training in the ultimate 
test device, if coupled with trauma, confers 
an advantage in both escape and avoidance 
learning. 

3. While it is true that trauma generally 
impairs escape and benefits avoidance be- 
havior, there are at least two conditions 
where it is not the case. Both are related to 
the behavior possibilities available to the S
at the time of trauma. Point 2 above indicates
that when positive transfer training is
involved, trauma is advantageous. At the 
other end of the scale, when no substantial 
instrumental behavior is permitted the Ss,
as in the case of the harness group, E6,
then escape behavior is disadvantaged but,
also, avoidance learning is impaired; re- 
strained animals who are shocked learn to 
avoid at the same slow rate as control 
animals who have never been shocked. 
Experiment IV: Residua and Adult Behavior
under Dissimilar and Unstressful
Test Conditions


The most general finding of Experiment 
III was that traumatic shock experience produces
a nonspecific residue although instrumental
conditioned responses may also
be acquired in certain specific instances. If 
this nonspecific residue is thought of as an 
emotional change, a sensitization to external 
stimuli, it should influence a very wide range 
of behavior. In order to test this possibility, 
a closed-field test was chosen as the test 
situation in Experiment IV. 

Notice that in this, as well as subsequent 
experiments, shock treatment was carried 
out with the windowless black box, although 
from Experiment III it is evident that the 
details of the situation are not important, so 
long as S is not given an opportunity to 
escape the shock. 


Procedure 


Ss were 24 albino rats from the University of 
Oregon colony. All Ss were males. At 20 days 
of age each animal was weaned and placed in a 
separate cage, and assigned to one of three groups 
of eight animals, matched with respect to litter 
and weight. For the infantile treatment, the 
apparatus used was the shock box. For the adult 
test, the apparatus used was the closed field. 

Infantile Treatment. The experimental procedures
used for infantile treatment were replications
of the procedures used for Groups E5, C1,
and C2 of Experiment III. Thus, Ss were either
shocked, handled, or ignored in infancy. 

Adult Test. From 100 through 109 days of age 
the three groups of rats used in this experiment 
were exposed to the closed-field test situation. 
Each S was given a single daily test trial of 10 
minutes. The number of spaces traversed was 
recorded separately for each minute of the trial. 
At the conclusion of the trial, after S had been 
returned to its living cage, the number of fecal 
boluses and the number of spots of urination were 
recorded, after which the apparatus was cleaned 
with a sponge. To minimize handling, each S was
transported to and from the test room in its living 
cage. The ad lib. feeding and watering conditions 
under which Ss had been maintained since wean- 
ing were not changed during the course of adult 
testing. 


Results 


Activity. In Figure 3a intertrial activity 
of each group is presented. Here each point 
represents the mean number of spaces
traversed in a single, daily 10-minute test
session. 

Figure 3b shows group intratrial activity. 
Each point in this case represents the mean 
number of spaces traversed during Minutes 
1, 2, 3, etc., when scores are summed over 
trials. 

By inspection it is evident that in both
graphs the individual curves have an overall
negative slope. When dichotomized
scores for each S were used, 18 out of 24
Ss were more active during Trials 1-5 than during Trials 6-10
(p < .05 by means of the Sign test—Siegel, 
1956). Likewise, the total number of 
squares traversed by each animal on Minutes 
1-5 was greater than during Minutes 6-10 
for 20 out of the 24 Ss (p < .01 by means 
of the Sign test). 
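
The two Sign test results can be checked with an exact binomial calculation. The sketch below assumes a two-sided test with no tied Ss; the text does not say whether the reported probabilities were one- or two-tailed, and the function name is illustrative.

```python
from math import comb


def sign_test_p(successes, n, two_sided=True):
    """Exact Sign test probability under the null hypothesis p = 0.5.

    successes: number of Ss showing the difference in one direction
    n:         number of Ss with a non-tied difference
    """
    k = max(successes, n - successes)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail) if two_sided else tail


p_trials = sign_test_p(18, 24)   # more active on Trials 1-5
p_minutes = sign_test_p(20, 24)  # more active on Minutes 1-5
print(round(p_trials, 4))   # 0.0227
print(round(p_minutes, 4))  # 0.0015
```

Both values fall below the levels reported in the text (.05 and .01), even on the more conservative two-sided reading.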

It appears, then, that activity decreases as 
a function of exposure to the closed-field 
situation; this reduction can be observed 
within a given trial, but also occurs between
trials spaced 24 hours apart. These data 
closely parallel those collected by other ex- 
perimenters (Berlyne, 1955; Welker, 1956) 
on exploration. 

Analyses of variance performed on the 
activity data indicate that no group differ- 
ences exist with regard to activity levels 
early or late in the test trial or early or late 
in the series of trials. 
Fig. 3. Activity of shocked, handled, and ignored
groups. (3a, upper: intertrial; 3b, lower:
intratrial. Experiment IV)


RESIDUA OF SHOCK-TRAUMA IN THE WHITE RAT 17 


Urination. The total number of urine 
spots over 10 days observed for the shocked 
group was 25, for the handled group 24, 
and for the ignored group 32. The Kruskal- 
Wallis H test for nonparametric data 
(Siegel, 1956) indicated no differences be- 
tween conditions. 
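
For readers unfamiliar with the Kruskal-Wallis H test, a minimal sketch of the statistic follows. Only the group totals (25, 24, and 32 spots) come from the text; the per-animal counts below are invented purely for illustration, chosen to sum to those totals, and should not be read as the original data.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H with midranks for ties and the usual tie correction.

    Assumes the pooled values are not all identical.
    """
    pooled = sorted(x for g in groups for x in g)
    n_total = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies.
    ranks = {}
    tie_sizes = []
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_sizes.append(j - i)
        i = j

    rank_term = sum(sum(ranks[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    return h / correction


# Hypothetical urine-spot counts per animal (eight Ss per group).
shocked = [0, 1, 2, 3, 3, 4, 5, 7]   # sums to 25
handled = [0, 1, 2, 2, 3, 4, 5, 7]   # sums to 24
ignored = [1, 2, 3, 4, 4, 5, 6, 7]   # sums to 32

h_value = kruskal_wallis_h([shocked, handled, ignored])
print(f"H = {h_value:.2f} (compare with 5.99, chi-square, 2 df, .05 level)")
```

With three groups, H is referred to the chi-square distribution with 2 degrees of freedom, so a value below 5.99 would not be significant at the .05 level.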


Defecation. Sixteen out of the total 24
rats did not defecate at all during testing in
the open field. Of the eight Ss who did
defecate, three had previously received
shock, three had previously received handling,
and two had been ignored. The
Kruskal-Wallis H test indicated no significant
differences between conditions.


Conclusions


1. The results of Experiment IV were
conclusively negative: neither intense electric
shock nor handling in infancy affected
subsequent behavior in the closed field.
Furthermore, the data presented on urination
and defecation discourage further use
of these measures as indicators of moderate
degrees of emotionality. Response frequency
was simply too low to provide for
adequate determination of individual differences.
As has been previously suggested
(Hunt & Otis, 1953), the Hall tests may be 
reliable only when anxiety level is relatively 
high. 

2. These negative findings reduce the 
scope of the hypothesis concerning nonspecific
residua: intense stimulation administered
in infancy clearly does not influence
the total range of adult behavior in rats.

3. Similarly, the consequences of handling 
may be limited, although once again our 
handled Ss received less total early experi- 
ence than Ss used in other experiments. 


Experiment V: Residua and Adult Behavior
under Similar, Stressful, and Unstressful
Test Conditions


Experiment III showed that Ss may display
at least some of the effects of traumatic
shock even when no instrumental behavior
is permitted during treatment (cf. harness
group, E6). However, it is clear from Experiment
IV that the residua induced by
shock is not a pervasive or highly general 
effect. There are two possible ways in 
which the residue might operate in adults: 
(a) it might be closely keyed to the occur- 
rence of shock, in which case shock might 
serve as a drive or cue factor; (b) there 
might be an environmental feature of the 
trauma situation which acquires either drive 
or cue value. 

In Experiment V, therefore, both treat- 
ment and test conditions were varied in such 
a way as to (a) maximize or minimize 
transfer from treatment to test, (b) com- 
pare different varieties of original stress 
(electric shock vs. cold), (c) compare adult 
performance with and without stress, and 
(d) compare the effects upon activity and 
response thresholds. 


Procedure 


Ss were 60 Sprague-Dawley rats from the 
University of New Brunswick colony. Ss were 
weaned at 20 days of age and placed on an ad lib. 
feeding schedule in individual cages. The shock 
box, the “white” closed-field box with a grid, the 
cold water apparatus, and the harness were used.

Treatment. From Day 31 to Day 40 each S was 
exposed to one of the following treatments. 

1. Group B (shock box—black). Ss were ex- 
posed to electric shock 4 minutes per day for 10 
consecutive days. The Ss in this group were 
treated identically with those of Group Es, Experi- 
ment II. 

2. Group W (white box). Ss were exposed to 
electric shock in identical fashion to those in 
Group B, except that treatment occurred in the 
white closed-field apparatus instead of the black 
shock box. 

3. Group CW (cold water). Ss were exposed
to 37°F. water 10 minutes per day for 10 consecu- 
tive days. 

4. Group H (harness). Ss in this group received
treatment identical to that of Group E6,
Experiment III. 

5. Group C1 (handled). Ss were placed in the
black shock box each day of the treatment period 
for 4 minutes, but were not subjected to shock. 

6. Group C2 (ignored). Ss were neither handled
nor shocked at any time after weaning until 100 
days of age, when adult testing began. 

Adult Testing. Commencing at 100 days of age,
Ss from each of the six groups received 6 days
of testing. The tests were of three types: (a) observation
of activity in the white shock box under
no-shock test conditions, (b) observation of activity
in the white shock box while exposed to 1.25
milliamperes of shock, (c) determination of response
threshold in the black shock box by the
method of limits. Table 8 shows the exact testing
schedule.

On Testing Days 1-4, activity was measured, as
in Experiment IV, by counting the number of
6-inch squares traversed by S during the session.
On Testing Days 5 and 6, response thresholds
were measured, using the technique described by
Kimble (1955). Two observers judged the responses
of each S as falling into three categories:
"jump," "flinch," and "no response" (see Kimble
for a definition of these categories). The method
of limits was used. Both the ascending and descending
series contained the following values (in
milliamperes): .05, .10, .15, .20, .25, .30, .40, .50,
.60, .70, .80, .90, 1.00. Ss received one ascending
and one descending series on each of the 2 days
of testing. The response threshold for each
animal was taken as the mean threshold of the
four runs. A "jump" threshold was also determined,
using the same statistical procedures.

The thresholds were studied to test two hypotheses.
One may be called the hypothesis of
instrumental response learning during shock. It
is possible that when animals are shocked they
acquire responses which mitigate the severity of
the shock. In that case, when tested later for
escape and avoidance under shock, these mitigating
responses should be expected to reappear, and,
depending upon the task, either facilitate or hinder
escape and avoidance.

The other hypothesis is that there is a general
change in reactivity as a result of shock. In that
event, when shocked, the experimental groups
which had previously experienced shock should
all behave differently from "inexperienced"
groups. It should be noted that our design does
not permit us to distinguish between these two
hypotheses; however, it does permit us to determine
if the general outcome they both predict,
different behavior under shock for traumatized
animals, is the case.


18 K. H. BROOKSHIRE, R. A. LITTMAN, AND C. N. STEWART


TABLE 8

SCHEDULE OF TESTING FOR SUBJECTS OF
EXPERIMENT V

Test                     Presence     Length of session
day      Apparatus       of shock     (in minutes)

 1       White box       No           5
 2       White box       No           5
 3       White box       Yes          5
 4       White box       No           5
 5       Black box       Yes          Indeterminate
 6       Black box       Yes          Indeterminate


Fig. 4. Activity as a function of test conditions.
(Experiment V)


Results


Figure 4 shows the mean activity scores
in the white box for Ss in each group as a
function of testing sessions. A separate
analysis of variance was performed on the
data for each testing session. Table 9 is a
summary of these analyses.


TABLE 9


SUMMARY OF ANALYSIS OF VARIANCE OF ACTIVITY IN WHITE SHOCK BOX
FOR TEST DAYS 1, 2, 3, AND 4


(Experiment V)

                    Day 1              Day 2          Day 3     Day 4
Source           p(a)     p(b)      p(a)     p(b)       p         p

Treatment       <.001    <.05      <.001    >.05      <.001     >.05
Sex             <.001    >.05      <.001    <.05      >.05      <[illegible]
Interaction     <.01               <.001              >.05      >.05


a Tested against residual.
b Tested against interaction.


The Tukey test indicates that Groups B
and W displayed significantly less activity
on Day 1, and that Groups B, W, and H
displayed significantly less activity on Day
3, when performance was measured under
shock. The treatment source was not statistically
significant on Days 2 and 4.

Table 10 shows the mean response and 
jump thresholds for the various treatment 
groups, as determined on Testing Days 5 and 
6. Analysis of variance indicated that treatment
was not a statistically significant
source of variance for either measure. 
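
The threshold scoring described in the Procedure can be sketched as follows. The intensity values and the rule of averaging the four runs are taken from the text. The per-series scoring rule (first response in an ascending series, last response before responding stops in a descending series) is an assumption about how the method of limits was applied, and the response records are hypothetical.

```python
# Shock intensities (milliamperes) used in both series, as listed in the Procedure.
LEVELS_MA = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.40,
             0.50, 0.60, 0.70, 0.80, 0.90, 1.00]


def ascending_threshold(responded):
    """Lowest intensity at which a response first appears.

    `responded` holds one bool per level, in ascending order of intensity.
    """
    for level, hit in zip(LEVELS_MA, responded):
        if hit:
            return level
    raise ValueError("no response anywhere in the series")


def descending_threshold(responded):
    """Lowest intensity still answered before responding stops.

    `responded` holds one bool per level, in descending order of intensity.
    """
    last_hit = None
    for level, hit in zip(reversed(LEVELS_MA), responded):
        if not hit:
            break
        last_hit = level
    if last_hit is None:
        raise ValueError("no response at the highest intensity")
    return last_hit


def animal_threshold(run_thresholds):
    """Mean over the four runs (two ascending, two descending), as in the text."""
    return sum(run_thresholds) / len(run_thresholds)


# Hypothetical response records for one S, invented for illustration only.
asc = [False] * 3 + [True] * 10   # first response at .20 mA
desc = [True] * 11 + [False] * 2  # last response at .15 mA
runs = [ascending_threshold(asc), descending_threshold(desc)] * 2
threshold = animal_threshold(runs)
print(f"{threshold:.3f} mA")
```

The same functions would serve for the "jump" threshold by recording only jump responses as hits.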


Conclusions 


1. Since both Groups B and W show
"fear," that is, low activity, on Day 1, the 
relevant cue must be the grid bars, as that 
is the one cue each had in common. 

2. Since Groups B, W, and H all show 
relative inactivity under shock (Day 3), the
important factor for this effect must be the 
only common element during treatment: 
electric shock. 

3. The results for handling and exposure
to cold are congruent with the field test
results of Experiment IV; there is no indication
that either treatment affects exploratory
behavior. Hence, the effects appear to be 
specific to shock. 

4. The "fear" effect demonstrated on Day
1 is easily masked. On Day 4, after all Ss 


TABLE 10


MEAN RESPONSE AND JUMP THRESHOLDS (IN
MILLIAMPERES) BY METHOD OF LIMITS ON
TESTING DAYS 5 AND 6 IN THE
BLACK SHOCK BOX
(Experiment V)


Group        Response threshold    Jump threshold

Cold               .165                 .282
White              .142                 .285
Black              .155                 .274
Harness            .178                 .291
Handled            .183                 .318
Ignored            .225                 .298

Mean               .175                 .291
had been exposed to shock the previous day,
no significant differences for treatments 
were found. 


5. The negative response threshold data 
indicate that it is unlikely that differences
in adult behavior may be explained in terms
of a responsivity notion. If there are
relatively permanent changes of this sort 
resulting from trauma they are either more 
subtle or different in kind from those 
measured by Kimble’s procedures. 


Experiment VI: Residua and Adult Be- 
havior under Nonshock Stress—Mild and 
Severe Hunger 


In Experiment VI the possibility that 
intense infantile stimulation might have gen- 
eral effects was studied. The negative results 
of Experiments IV and V might be at- 
tributable to the lack of reliability or validity 
of the test measures. Further, the closed- 
field test is typically conducted in the 
absence of any known motivation (except 
exploratory drive), and it could be that the 
effect would exhibit itself only under testing 
conditions of more substantial motivation. 
The work of Malmo (1957) with humans 
suggests this possibility; his studies with 
psychiatric patients suffering from patho- 
logical anxiety indicated that the resting 
levels of physiological function of patients 
usually could not be distinguished from 
normal Ss. Rather, differences existed only 
when Ss had been exposed to stressful situa- 
tions. Thus, in the present experiment, Ss 
were required at the time of testing to 
traverse an elevated runway for food under 
either a mild or severe hunger drive. It was 
assumed that hunger was distressing to the 
Ss, yet both the "stressor" and the test 
situation would be totally different from 
those used in the earlier treatment of the 
animal. 

Incidentally, the weight gains of each 
group were determined, in order to attempt 
to replicate the transient weight differential 
reported by Scott (1955); the present experiment,
however, used a shock level four
times as intense. 
Procedure


Sixty albino rats from the University of Oregon 
colony were divided into three groups of 20 each. 
Each group contained an equal number of males 
and females, matched with respect to litter affilia- 
tion and weight. Ss were weaned at 20 days of 
age and placed in separate cages. For the infantile 
treatment, the windowless black shock box was
again used to shock infant Ss. For the adult
test, the runway apparatus was used. 


Infantile Treatment. Ss in Experiment VI were 
either shocked, handled, or ignored during infancy 
(from 21 through 25 days of age). The experi- 
mental procedures used for infantile treatment 
were replications of those used in Experiment IV, 
except that all animals were weighed at 20, 25, 
50, and 85 days of age. In one sense, then, the 
Ignored group was not truly ignored, although 
weighing was accomplished without handling by 
the E. (S was weighed in a removable cage, and 
the weight of the latter was assumed to be con- 
stant within the accuracy of our data.) 


Adult Test. At 85 days of age, shocked, han- 
dled, and ignored groups all were removed from 
ad lib. feeding and given access to food for only 
1 hour per day. At 100 days of age, Ss from each 
group were divided equally and assigned to one 
of two subexperiments:

1. Experiment VIa. Beginning on Day 100, Ss 
were given one training trial per day on the 
elevated maze. Reward was 2 grams of wet mash. 
Following this single training trial, the animal was 
returned to its living cage and permitted an hour's
free feeding. Each day, then, S was trained under 
a 23-hour hunger drive. 

2. Experiment VIb. Ss were also given one
training trial per day on the elevated runway 
beginning with Day 100, and the same reward was 
used. However, these animals were not permitted 
an hour's daily free feeding and thus were ex- 
posed to gradual starvation. 


The performance measure used in both sub- 
experiments was running time. Ss were weighed 
daily during the course of the experiment, im- 
mediately before the training trial. 


Results 


Weight Changes. Table 11 indicates the 
mean body weight of shocked, handled, and 
ignored Ss as a function of age and de- 
privation schedule. Using analysis of vari- 
ance, a significant difference between con- 
ditions was obtained only at age 25 days. 
It appears that the shock retarded weight 
gain during the treatment period, but that 
compensation occurred during the succeed- 
ing 2 months. The mean weight of the 
Handled group, although slightly larger than 
the Ignored group at each weighing, never 
achieved statistical significance. It may be 
that the trend simply represents the continuation
of a small group difference at the
beginning of the experiment. These results
confirm those reported by Scott (1955)
using a much lower shock level, but, with
regard to the effects of handling, do not substantially
support the data of Weininger
(1956), McClelland (1956), and Bernstein
(1952). However, it must be pointed out
that the handling procedures employed in
these latter experiments were much more
elaborate and extended over a much longer
period of time than the treatment to which
our animals were exposed.


TABLE 11

MEAN WEIGHTS OF GROUPS AT VARIOUS AGES
(Experiment VI)

Age
(in days)        Shocked            Handled            Ignored

  20              40.5               40.5               40.3
  25              64.4               71.6               71.5
  50             189.0              198.9              190.9
  85             293.7              298.1              288.4

              Subexperiment      Subexperiment      Subexperiment
                A       B          A       B          A       B

 100          239.4   238.4      243.8   244.7      238.0   240.4
 101          236.2   224.9      241.1   233.1      236.8   228.6
 102          232.8   213.7      238.8   221.1      231.5   216.6
 103          229.5   204.1      235.1   210.1      229.2   206.3
 104          226.3   191.7      231.8   198.5      226.9   195.3
 105          223.5   180.6      230.1   186.3      221.0   184.6

It is apparent from the table that both 
deprivation procedures produce striking and 
consistent weight losses in all groups. 
Furthermore, the severe deprivation to 
which Ss in Experiment VIB were subjected 
accelerated weight loss—and, indeed, caused 
death in some Ss by Day 106. Although 
analysis of variance yielded negative results 
for the infantile treatment factor, sex and 
deprivation schedule were highly significant. 
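The size of the weight loss described above can be read directly off Table 11. The following is a minimal sketch, not part of the original analysis: it takes the Day 100 and Day 105 group means from the table and expresses the loss as a percentage of the Day 100 weight. The names `table_11`, `percent_loss`, and `losses` are illustrative assumptions, and because the table states no unit, only the unit-free percentage is computed.

```python
# Group means from Table 11 at Day 100 and Day 105.
# Subexperiment A: 23-hour deprivation; Subexperiment B: no daily free feeding.
# The dictionary layout and all names here are illustrative assumptions.
table_11 = {
    "Shocked": {"A": (239.4, 223.5), "B": (238.4, 180.6)},
    "Handled": {"A": (243.8, 230.1), "B": (244.7, 186.3)},
    "Ignored": {"A": (238.0, 221.0), "B": (240.4, 184.6)},
}


def percent_loss(start, end):
    """Weight lost between two weighings, as a percentage of the first."""
    return 100.0 * (start - end) / start


losses = {
    group: {sub: round(percent_loss(*weights), 1) for sub, weights in subs.items()}
    for group, subs in table_11.items()
}

for group, subs in losses.items():
    print(f"{group:8s}  A: {subs['A']:5.1f}%   B: {subs['B']:5.1f}%")
```

On these means, every group under the severe schedule loses several times as large a share of its Day 100 weight as the same group under the 23-hour schedule, while the three treatment groups remain close to one another within each schedule.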


Runway Latencies. Experiment VIA. In 
Table 12 are presented running times for 
shocked, handled, and ignored Ss under 23-
hour food deprivation. It is evident that 
time scores diminish rapidly and consistently 
for all groups of Ss as a function of train-
ing. The Kruskal-Wallis H test, performed 
on the latency scores for Day 1, indicates 
that the ignored Ss are greatly inferior to 
either handled or shocked groups (p < .05). 
No significant differences were obtained for 
other training days. 
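For readers unfamiliar with the Kruskal-Wallis H test, the computation can be sketched as follows. This is an illustration only: the individual latency scores are not reported here, so the three lists below are hypothetical values invented for the example, and the function name `kruskal_wallis_h` is likewise an assumption. The statistic is computed from pooled ranks, with the usual correction for ties, and compared with the chi-square critical value for 2 degrees of freedom at the .05 level (5.991). With groups this small the chi-square comparison is only approximate.

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H for two or more independent samples (tie-corrected)."""
    pooled = sorted(x for g in groups for x in g)
    n_total = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        t = j - i                                # number of tied scores
        rank_of[pooled[i]] = (i + 1 + j) / 2.0   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    rank_sums_sq = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * rank_sums_sq - 3.0 * (n_total + 1)
    correction = 1.0 - tie_term / float(n_total ** 3 - n_total)
    return h / correction


# HYPOTHETICAL Day 1 running times in seconds (not the experimental data).
shocked = [120, 150, 160, 175]
handled = [130, 140, 170, 180]
ignored = [230, 250, 270, 300]

CHI2_CRIT_DF2_05 = 5.991  # chi-square critical value, df = 2, alpha = .05

h = kruskal_wallis_h(shocked, handled, ignored)
print(f"H = {h:.3f}; exceeds critical value: {h > CHI2_CRIT_DF2_05}")
```

Because the test uses ranks only, a few very long latencies on the first trial do not dominate the result as they would in an analysis of variance on raw times.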

Experiment VIB. Table 12 shows the 
time scores for experimental groups under 
the more severe deprivation procedure. This 
procedure resulted in death by starvation for 
all Ss. The earliest deaths occurred on Day 
7 of testing (106 days of age). Thus, 
latency data are presented only for those 
trials when the total sample was tested. By 
means of the Kruskal-Wallis H test, it was 
found that, as in Experiment VIA, the per-
formance of the Ignored group was signifi-
cantly inferior to that of shocked or 
handled Ss on Day 1. Since the statistically 
significant inferiority of ignored animals on 
Day 1 has been obtained in two separate 
cases, we may reject the null hypothesis with 
some degree of confidence despite having 
capitalized on chance by performing indi- 
vidual tests on each training day. The in- 
feriority of the Ignored group once again 
diminished markedly on further training 


TABLE 12
MEAN RUNNING TIME (IN SECONDS) FOR EACH EXPERIMENTAL GROUP AS A
FUNCTION OF TRAINING
(Experiment VI)

            Shocked          Handled          Ignored
Trial    Subexperiment    Subexperiment    Subexperiment
           A       B        A       B        A       B

  1      152.2   148.2    167.5   137.0    263.8   278.4
  2      255      43.7     63.1    52.3     90.2    67.8
  3       45.8    15.4     36.7    19.9     80.3    27.1
  4       34.9    10.5     38.3    14.5     44.3    17.6
  5       21.8   1575      30.0    10.7     27.7    10.7
  6       16.6    17.8     15.4     8.7     17.4     9.0
  7       12.1              9.8             11.9
  8       11.5             10.1             11.4
  9       11.2             10.1             10.3
 10       10.5              9.6             10.1
trials. By Training Day 4 all groups were 
performing efficiently. However, on Days 
5 and 6 the following occurred: the shocked 
Ss began making inferior goal responses. 
These responses did not appear to be the 
result of debility; Ss in all groups responded 
vigorously on all test trials until approxi-
mately 12 hours before death. The response 
inferiority of the Shock group produced sta- 
tistical significance on Day 6 but not on Day 
5, using the Kruskal-Wallis H test. How- 
ever, since an individual statistical analysis 
was performed for each training day, we 
have capitalized on chance, and a replica- 
tion of this portion of the experiment is in 
order. 
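The risk of capitalizing on chance can be made concrete with a short calculation. The sketch below is illustrative rather than a reanalysis: it assumes that the daily tests are independent of one another, which is unlikely to hold strictly because the same animals are tested each day, and the function and variable names are hypothetical. It gives the probability of at least one spuriously significant day when each of several daily tests is run at the .05 level, and the much smaller probability that the same day is significant by chance in both subexperiments.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one false positive in n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


ALPHA = 0.05

# One test per training day: six days in Subexperiment B, ten in A.
fw_six = familywise_error(ALPHA, 6)
fw_ten = familywise_error(ALPHA, 10)

# Chance that one particular day is significant in BOTH subexperiments,
# assuming the two samples are independent and the null hypothesis holds.
same_day_both = ALPHA ** 2

# Chance that at least one of the six shared days is significant in both.
any_day_both = familywise_error(same_day_both, 6)

print(f"at least one false positive in 6 tests:  {fw_six:.3f}")
print(f"at least one false positive in 10 tests: {fw_ten:.3f}")
print(f"a given day significant in both samples: {same_day_both:.4f}")
print(f"any shared day significant in both:      {any_day_both:.4f}")
```

Under these assumptions a single significant day among six is not strong evidence, which is why the Day 6 result calls for replication, whereas agreement between two independent samples on the same day, as with the Day 1 inferiority of the ignored Ss, is far less likely to arise by chance.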


Age and Weight at Time of Death by 
Starvation. The frequency of deaths and 
weight at death in each group was studied. 
Differences between treatment groups were 
not statistically significant for either meas- 
ure. 


Conclusions 


1. The data on the relative weight of Ss 
shocked, handled, or ignored substantially 
support those reported by Scott (1955): 
Ss shocked in infancy gained less weight 
during the period of treatment than non- 
shocked Ss; however, this difference was 
only transient and cannot be reproduced in 
adulthood even under conditions of depriva- 
tion. Thus, if intense stimulation in infancy 
produces somatic changes, these changes are 
not reflected by body weight. 


2. Several experimenters (Levine & 
Otis, 1958; Weininger, 1953) have reported 
that rats handled early in life show signifi- 
cantly less mortality following food and 
water deprivation. The data presented in 
this experiment do not substantiate these 
findings nor do they suggest that Ss shocked 
in infancy are more or less viable than con-
trols. 


3. Handling was observed to have a bene- 
ficial effect upon running times to obtain 
food at least on Trial 1, but group differences 
tended to disappear on subsequent trials. 
Thus, the combined data of this and the 
preceding experiments indicate that the par-
ticular handling procedures employed in this 
study had relatively little residual effect upon 
adult behavior. We have already offered an 
hypothesis to account for our failure to con- 
firm previous results in this area. 


4. The latency data of this experiment 
were quite interesting and need to be repli- 
cated. They suggest that nonspecific residua 
resulting from infantile trauma operate on 
behavior only under conditions which in- 
volve a substantial amount of stress; the 
performance of the Shocked group was im- 
paired only after Ss had been severely de- 
prived for several days. Coupled with the 
positive results of Experiment III, where 
the dependent variable was escape-avoidance 
behavior under intense shock, and the 
negative results of Experiment IV, where 
the dependent variable was exploration, we 
have placed some limits on the behavioral 
generality of the consequences of traumatic 
experience. 


Discussion 


The foregoing studies provide an over- 
whelming demonstration of the enduring 
effects of electric shock trauma. In one 
sense, of course, such findings are not par- 
ticularly novel. There is an enormous 
literature dealing with the effects of shock 
upon such acts as bar pressing, maze run- 
ning, exploratory behavior, and shuttlebox 
jumping. But the present results are dis- 
tinctive because they involve such a long 
time between traumatization and testing. To 
be sure, there have been a number of studies 
dealing with the retention of CRs over ex-
tended periods. The most noteworthy of 
these undoubtedly are the classical investi-
gations of Liddell on experimental neurosis 
(see, for example, Liddell, James, & Ander-
son, 1934, where the 2-year retention of a 
flexion response was demonstrated); again, 
Wendt (1937) has shown the retention of 
an avoidance flexion for 2.5 years. 

There is every reason to believe that 
there are similar, if not identical, mecha-
nisms underlying the short- and long-term 
effects of shock. However, there is one 
feature of the present work which distin-
guishes it from that of, say, Liddell or 
Wendt: that is the relative infrequency of 
trauma and lack of significant resemblance 
to testing conditions—save for the shock, of 
course, and certain special training condi-
tions, as in Experiment III; it is rather like 
the traumatic events which are the subject 
of clinical speculation and theorizing. In- 
vestigations like those of Liddell or Wendt 
have the clearly structured aspects of con- 
ditioning operations in which cue and rein- 
forcement are systematically related. Our 
own operations involve an undifferentiated 
traumatic condition wherein the shock and 
cue factors are related rather unsystemati- 
cally, and where the long-term changes in be- 
havior are dependent not only upon the 
nature of the test situation but even appear 
to be unrelated, in part, to the behavior of 
S during original treatment. 

We shall review the major findings and 
deal with a number of questions and prob- 
lems which they raise. 


Previous Experience and Early Experience 


The development of a long lasting effect 
of shock is, of course, the main "finding." 
It has been shown in Experiment I that 
this effect—which we have called a residual 
—is indifferently the same whether it is 
based upon traumatizing a weanling or an 
adult. In this sense, the results are only in- 
directly relevant to problems of develop- 
ment. It is, however, an object lesson for 
the problem we raised in the introduction, 
viz., the distinction between previous ex- 
perience and early experience. At least with 
regard to the effect of shock, it appears that 
the effects of early trauma and previous 
trauma are the same. 

Such a generalization has to be qualified, 
however, especially as it might be applied 
to developmental phenomena. Our Ss were 
traumatized as weanlings and studied as 
relatively young adults. It would be dan-
gerous to assume that the effects of shock 
given prior to weaning, say at 5, 10, or 15 
days of age, would be the same as that 
administered subsequent to weaning. Like 
most rodents, the rat matures very swiftly. 
Thus the consequences of severe shock could 
reasonably be expected to differ even for so 
short a time as 10 days. 

What might these differences be like? 
There are at least two rather different (but 
not incompatible) possibilities. The simplest 
outcome is that there would be no effect; at 
some period prior to weaning the Ss may 
simply be too young to establish a residual 
with consequences for problem solving be- 
havior; Denenberg (1958) presents evidence 
on the conditionability of infant rats which 
supports this. The other possibility is that 
the effect of shock might be so destructive 
that the ordinary capacities of the Ss 
are permanently changed. It is easy enough 
to speculate about what these effects might 
be like; there are ample grounds for expect- 
ing permanent central nervous system and 
endocrine changes. It is more difficult to 
know in detail what the alterations might be 
like. Overriding these problems, though, is 
the question of knowing what are the be- 
havioral correlates of particular damage 
changes. It is here where evidence is mini-
mal and the need so great. 

At the other end of the development 
picture is the question of senescence. The 
deterioration of capacities with age is a most 
complex problem and it is impossible to pre- 
dict on the basis of gerontological research 
what to expect about the fate of residua over 
an interval of, say, 900 days as compared 
with the period of 100 that we used, where 
the longer period covers the growth and 
decline of the organism. It is clearly an 
investigation that is needed. 

Another limitation to keep in mind is 
that our results may hold only for electric 
shock. For example, in Experiment V we 
obtained no indication that exposing wean- 
lings to extreme cold produced any residua. 
Nevertheless, there is an accumulating body 
of evidence to show that frequent handling 


5 The work of Denenberg and Levine indicates 
that for both the rat and the mouse this is very 
likely the case. In several studies which are now 
being prepared for publication, R. W. Leary of 
the University of Oregon has shown that there 
are, indeed, substantial behavior differences be- 
tween animals traumatized at 5, 10, and 20 days 
of age when tested in the runway apparatus used 
in these studies. 
at an early age produces residua which 
manifest themselves in a variety of ways 
later in life. On the other hand, the work of 
Ader (1959) suggests that the age at which 
the experience occurs may be a crucial factor 
since he found no differences between 
animals who were handled beginning with 
age 23 vs. 136 days. Consequently, there 
appear to be developmental thresholds for 
the effects of different sorts of experiences, 
and it is reasonable to suppose that these 
differ for different kinds of experience. 

So we are able to advance only a rather 
limited generalization: there is no difference 
in the effects of a shock over a 100-day in- 
terval whether the shock is given at 21 or 
121 days of age. 


Interval between Trauma and Test 


While we have shown that the age at 
which trauma occurs is irrelevant to the 
operation of residua over an interval of 100 
days, there remains the matter of the length 
of the interval between trauma and test. 
We have only one bit of evidence on this 
question, and this comes from the same ex- 
periment on which the preceding discussion 
was based, Experiment I. While the long- 
interval animals did not differ from one 
another they did differ from the short-in- 
terval animals; those Ss who were trauma- 
tized at 121 days and tested at 125 did bet- 
ter on escape than the remaining experi- 
mental animals and all control groups. 

It is difficult to account for this outcome, 
though there are several possibilities which 
suggest themselves. First, of course, is the 
possibility that this is a chance outcome; we 
hope to answer that by a replication. How- 
ever, it is our belief that the magnitude of 
the superiority is too great, exceeding even 
the control level, to be attributable to chance 
fluctuation; this is, at any rate, the assump- 
tion underlying the remainder of the discus- 
sion. A more interesting possibility is that 
there is some latent feature of the residua 
which requires a lengthy interval of time to 
mature; if Ss are tested after a shorter 
period, then the usual effect of shock upon 
escape behavior does not appear. This sug- 
gestion suffers from the fact that it does 


not appear able to account for the superi- 
ority of the immediately tested animals 
relative to control animals; at most it ac- 
counts only for superiority over other 
traumatized animals. 

There is a variant of this approach, stemming 
from the work of Griswold and Gray (1957), 
which is more suggestive. They found 
that electroconvulsive shock reduced the 
susceptibility of rats to the lethal effects of 
the Noble tumbling drum (1943); this was 
in addition to the demonstration that 
tumbling itself, in small doses, built up 
almost complete resistance to extended 
trauma. In our own experiments, it is 
reasonable to assume that the traumatization 
operations built up some experiences with 
shock so that if it were again to be met, as in 
the alley runway, a relatively organized re- 
sponse to an otherwise disabling experience 
could emerge quickly. To adopt this hy- 
pothesis, it is necessary to assume that the 
modulating effects of recent experience 
disappear with time. So long as it lasts its 
effect is to diminish the drive level induced 
by shock, a kind of "grin and bear it" 
phenomenon. If such an interpretation is 
warranted then the old suggestion that emo- 
tional and high drive states reduce the range 
of available cues that Ss use (Easterbrook, 
1959) can be put to work; the effect of 
recent traumatization is to adapt the 
organism in such a way that the drive level 
induced by shock is lower than for animals 
traumatized long ago or who have never 
had any previous shock experiences. The 
shock experienced under escape conditions 
does not disorganize recently shocked 
animals so much, that is, their drive level is 
not so high, and they are better able to dis- 
criminate and use environmental cues lead- 
ing to safety. 

As is true of all ad hoc explanations, the 
present suggestion has its troubles, the main 
one being the fact that the immediate 
animals are not also superior to more dis- 
tantly traumatized Ss at avoidance. 

Another possibility, suggested by the re-
sults of Experiment V, is that the superior 
performance of Ss tested for escape be-
havior immediately after exposure to shock 
represents the transitory presence of a con-
ditioned running response to the UCS, a 
running response which is “lost” over a 
period of 100 days, leaving only our nor- 
mally observed residue—which, in the run- 
way apparatus, is manifested in longer escape 
times and superior avoidance behavior. This 
explanation is satisfactory, however, only if 
it can be determined that the so-called 
"normal" residua represent something other 
than a learned response to shock, since two 
incompatible, learned responses cannot be 
used to account for disparate observations 
stemming from the same antecedent condi-
tions! 

In any event, the data suggest that the 
long- and short-term consequences of shock 
may be different. Whether this results from 
the operation of latent or maturing mecha- 
nisms or transient conditions cannot now be 
determined. 


Adaptive Consequences of Trauma 


One of the notions underlying the interest 
in the effects of early experience is the pos- 
sibility that certain experiences may prove 
to be benign or malignant. This would mean 
that independently of the kind of problem 
subsequently confronting the S, the effect of 
having had a certain kind of experience 
would invariably produce “good” or “poor” 
behavior. On the face of it, such a notion 
would not appear to be a good one because 
the measures of satisfactory performance 
are so arbitrary for any activity that it is 
always possible to conceive of measures 
which place a premium or value on slower, 
hesitant, or erroneous outcomes. Hence, on 
purely analytic grounds, an expectation that 
there will be universally deleterious conse- 
quences from an experience, independent of 
the measures to be used, cannot be vindi-
cated. 

The outcomes of the present set of in- 
vestigations support this logical point. In 
general we have found that trauma produces 
inefficient escape behavior—where a lengthy 
interval between it and test occurs—and 
efficient avoidance behavior relative to con-
trol animals. It is obviously the same set of 
experiences which does this, so there must 
be something about them which, depending 
upon the conditions at the time of testing, 
will produce fast or slow, good or poor be- 
havior. The job of understanding the effects 
of previous or early experience should there- 
fore consist of working out the properties of 
trauma-induced residua, the properties of 
the test conditions, and the laws connecting 
them. Needless to say, the extent to which 
this can now be done theoretically—as con- 
trasted with inductive generalizations—is 
sharply limited. 


Varying the Treatment Parameters 


The question of the specific conditions 
necessary at time of treatment for the ap- 
pearance, and for the systematic variation, 
of long-term, trauma-induced residua has 
been explored in two ways. First, the shock- 
ing schedule has been altered in order to 
determine whether the length of a single 
exposure or frequency of exposure is re- 
lated to the magnitude of the adult changes. 
Second, by varying the response possibilities 
at the time of treatment we attempted to 
learn something about the instrumental 
nature of the residua. 


Varying the Traumatization Schedule. 
The effects of varying the frequency, 
strength, or duration of shock upon specific 
operants or respondents have been explored 
by a number of investigators. In all cases 
there is evidence for a relationship, though 
the exact form is by no means clear. 
Kimble's (1955) data suggest an inverse 
relation between response latency and shock 
strength, whereas Brush (1957) and others 
have obtained results which suggest that the 
relationship is nonmonotonic. In these in- 
vestigations, however, the shock was ad- 
ministered in contingent relationship to a 
response which had to be learned or had 
been learned, the data consisting of varia-
tions in the rate of acquisition or extinction. 
The present series of studies involve a shock 
administered before there is a learning as- 
signment, though it is obvious that the usual 
responses to shock are clearly related to the 
running response which we used as a 
measure. For that reason, indeed, discover-
ing this relationship might be said to be one 
of our main objectives. 
In any case, the data establish a relation-
ship between adult behavior and both shock 
duration and frequency for the schedules used: Exposure to 
the apparatus without shock (handling con- 
dition) cannot ordinarily be detected in adult 
behavior, and greater amounts of shock per 
session or greater numbers of sessions in- 
crease the magnitude of the adult effect, at 
least within the parametric limits of this 
investigation. Notice, however, that the em- 
pirical demonstration of this relationship 
can be used to support a large number of 
hypotheses concerning the nature of shock 
residua, so that these results may be infor- 
mative and even supportive, but not dis- 
criminative in regard to theory. They simply 
indicate further that, whatever explanation 
is offered, it must be an explanation which 
assumes the residua to be affected by some 
or all of the conditions present when S is 
shocked at 21 days of age, and which allows 
for variation of shock exposure to affect the 
magnitude of later behavioral changes. 

On the other hand, the distribution of 
shock during treatment does not appear to 
affect the residua. In Experiment III, Ss 
given continuous (E,) or intermittent shock 
(Es) during a single session, and Ss given 
short “trials” of shock with removal from 
the apparatus following each trial (E2), 
fared about the same in the adult testing 
situation. Thus, although the total amount 
of shock per treatment is related to future 
performance, variations in the way that S 
is exposed to a constant amount of shock 
(with all other parameters, such as treat- 
ment apparatus, controlled) do not appear to 
be significant. This, of course, argues 
strongly against an explanation of the per- 
formance of these groups based on a simple 
version of instrumental learning theory. 

Providing Ss with Different Behavior or 
Response Possibilities. The various groups 
involved in Experiment III give us a reason- 
ably coherent picture of some things which 
occur during traumatization. One of the 
most interesting findings fits squarely into 
the picture as almost any learning theory 
would see it. The Ss who were given direct 
escape practice showed marked improvement 
in escape running as pups and when tested 
as adults were superior to all other trauma 


training groups in both escape and avoid- 
ance. While the question of how the escape 
running is learned may be treated differently 
by learning theories, there should be agree- 
ment about its consequences; all should pre- 
dict the positive transfer to mature behavior. 
Insofar as there is such a clear-cut case of 
positive transfer—more accurately, con- 
tinued training after a lengthy interruption 
—there is no problem. Nevertheless, the fact 
that transfer does occur, indicates clearly 
that there must be some common cue ele- 
ments involved; whether these are all en- 
vironmental stimuli or whether some of them 
may be based upon shock effects or other 
endogenous conditions is not clear. One of 
the cues that is clearly involved in the escape 
practice group that does not hold for other 
groups, is the raising of the door between 
the start box and the alley. An increase in 
running speed requires the animal to begin 
moving as soon as the door goes up. How- 
ever, whether there is also independent per- 
ceptual learning about shock (for example, 
discriminative stimuli), is not known. 

In any event, the role of direct positive 
transfer is unequivocally demonstrated. 
What now about the other groups in Experi-
ment III? For these groups, no provision 
for learning an instrumental response was 
made. However, despite marked variations 
in their weanling treatment schedules, they 
did not differ as adults. Since they were 
superior to nonshocked controls in avoid- 
ance it is inescapable that the residua are 
not only the products of instrumental learn-
ing such as shown by Group E,. This is 
reinforced by the fact that while E; was 
superior in both escape and avoidance, the 
other shocked animals were superior only 
in avoidance and were, indeed, inferior in 
escape. 

Further, the behavior of Ss in Experi-
ments III and V who were exposed to shock 
while strapped in a harness argues strongly 
for the independence of at least a part of 
the residual from cue and response possi-
bilities when trauma occurred. At the time 
of trauma, these groups could make no overt 
movement (except of the head) and had no 
useful visual or tactual cues for behavior in 
the later testing situations. Yet, in Experi-
ment III, the harness group displayed the 
typical pattern of inefficient escape behavior, 
and in Experiment V it showed lowered 
activity in the presence of shock. Clearly, 
these behavioral effects do not depend upon 
similarity of neutral cues between treat- 
ment and test, and they do not lend them- 
selves to any explanation based upon the 
acquisition of an instrumental response.6 

The only necessary stimulus appears to 
be the unconditioned stimulus, the electric 
shock itself. This “pure shock effect" ex- 
plains the independence of activity and 
shock-escape behavior for the harness and 
control groups; furthermore, it explains the 
absence of a relationship between escape 
and avoidance behavior in the same situation 
for most experimental groups. 

There are, then, two sources of residua 
so far: direct training (probably involving 
the visual cue of the escape door being 
raised) and the shock itself. Are there any 
other sources? Since the harness groups 
show the effect of shock only in escape, 
while other nondirect training groups also 
show it in avoidance there is every reason to 
expect another source to be present. What 
it is emerges from Experiments III, IV, and 
V. 

In Experiment V, Ss shocked in the black 
shock box and tested in the closed-field 
maze having a grid showed the typical “fear” 
reaction of depressed activity; in contrast, 
the Ss in Experiment IV exposed to iden-
tical treatment and test conditions except 
that the closed field had no grid, could not 
be differentiated from controls. It appears, 
then, that weanlings acquire a fear of the 
grid bars. Confirmation of such an acquired 
drive comes from Experiment III, where, 
again, the black shock box group displayed 
“superior” avoidance behavior compared 
with the harness group or controls. That 
the fear is a drive, in the conventional sense, 
and not an instrumental response is easily 
seen from a comparison of the black box 
groups in Experiments III and V. In Ex-
periment III performance under no-shock 
(avoidance) conditions is enhanced, whereas 
in Experiment V performance under no-
shock (activity) conditions is depressed. 


6 The only possibility is that there was a “no 
response,” freezing reaction learned. However, 
since the response thresholds for the harness 
groups (Table 10) do not differ from those for 
the other groups, this possibility must be rejected. 


It appears, then, that we have at least 
three residua operating in these experiments. 
To the "instrumental habit" resulting from 
direct training and the "pure shock" effect 
exhibited by the harness groups, we must 
now add a "learned drive" of fear which is 
conditioned to the grid bars appearing in 
both trauma and testing devices. 


Varying the Test Parameters 


Obviously, the previous discussion of 
variations in cue and response possibilities 
at the time of treatment necessarily indi- 
cated some important points regarding the 
effects of varying test parameters. It made 
clear that two of the residua, instrumental 
habit and learned drive, depend upon the 
similarity of cues between treatment and 
test. Thus, marked variations in the test 
parameters that preclude such similarity, 
would necessarily destroy the behavioral 
effects of those two residua. The results of 
Experiment IV show this to be the case. 

The pure shock variable has been con- 
structed from the observations of the harness 
group of Experiments III and V. In both 
these experiments it should be noted that 
modifications of behavior following shock in 
the harness occur only in the presence of 
shock at time of testing. Thus, harness Ss 
do not differ from control Ss in avoidance 
behavior (Experiment III) and they do not 
differ from control Ss in activity under no-
shock testing conditions (Experiment V). 
Hence, the effects of the pure shock variable 
would appear to be observable only in the 
presence of shock. 

That this is not so, however, will be seen 
from a consideration of Experiment VI. 
When Ss were tested on an elevated runway 
to food under mild deprivation, there were 
no differences between experimental and 
control groups. However, when run under 
“severe” hunger, experimental Ss ran more 
slowly than control Ss. 
Since treatment and test conditions were 
the same in Experiments IV and VI except 
for the level of the hunger drive we must 
assume that this latter condition is the factor 
responsible for the different outcomes of 
these two experiments. Thus it can be seen 
that to demonstrate the effects of early 
shock-trauma upon adult behavior, there 
can be wide variations in test apparatus, 
variations which preclude the use of learned 
cues. From Experiment VI we may draw 
the tentative hypothesis that the only req- 
uisite testing condition is high drive level. 
Ss may be motivated by shock or by hunger, 
and if it is sufficiently intense, the effects of 
the “pure shock variable” will be evident. 
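The conditions under which each of the three residua is expected to show itself can be summarized as a small set of rules. The sketch below is one reading of the discussion, offered as an aid to the reader and not as a formal model proposed in this paper; the class names, field names, and the function `expressed_residua` are all hypothetical. It treats the pure shock effect as requiring only high drive at test, learned fear as requiring grid bars at both treatment and test, and the instrumental habit as requiring escape training plus shared cues at test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Treatment:
    shocked: bool          # S received electric shock at treatment
    escape_training: bool  # S could practice the escape run while shocked
    on_grid: bool          # S stood on grid bars when shocked (False in harness)


@dataclass(frozen=True)
class AdultTest:
    shares_training_cues: bool  # e.g. the start-box door of the alley runway
    grid_present: bool          # test apparatus has a grid floor
    high_drive: bool            # shock at test, or severe hunger


def expressed_residua(treatment, test):
    """Return the residua expected to appear in adult behavior (a sketch)."""
    active = set()
    if not treatment.shocked:
        return active
    if test.high_drive:
        active.add("pure shock")
    if treatment.on_grid and test.grid_present:
        active.add("learned fear")
    if treatment.escape_training and test.shares_training_cues:
        active.add("instrumental habit")
    return active


harness = Treatment(shocked=True, escape_training=False, on_grid=False)
black_box = Treatment(shocked=True, escape_training=False, on_grid=True)

cases = {
    "harness, avoidance trial (Exp. III)": (harness, AdultTest(False, True, False)),
    "harness, escape trial (Exp. III)": (harness, AdultTest(False, True, True)),
    "black box, field without grid (Exp. IV)": (black_box, AdultTest(False, False, False)),
    "black box, grid field, no shock (Exp. V)": (black_box, AdultTest(False, True, False)),
    "black box, severe hunger (Exp. VI)": (black_box, AdultTest(False, False, True)),
}

for label, (treatment, test) in cases.items():
    print(f"{label}: {sorted(expressed_residua(treatment, test))}")
```

Encoding the rules this way makes it easy to see which combinations of treatment and test have not yet been run, and which outcomes would contradict the three-factor account.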


What Are Trauma Induced Residua? 


In the introductory section the term “re-
sidua” was proposed as the name for the 
effects of trauma. This neutral usage was 
adopted because it was not clear when these 
investigations were begun what sort of 
phenomena would emerge. The title of this 
paper indicates our resolve, even after the 
analysis of six experiments, to adhere to 
the neutral usage of the term. The plural 
form is used to express the conviction that 
there are a number of factors rather than 
a single one and to provide room for the 
possibility that these factors may even be 
completely independent of one another in 
their mode of operation. 

In an area so little explored yet so rich 
with allusive terms, theorizing is, of course, 
dangerous. Nevertheless, in order to guide 
the discussion, we have introduced three 
constructs to denote different aspects of the 
relationship between early shock-trauma and 
adult behavior. We turn now to a brief con-
sideration of each. 

Pure Shock Variable. The pure shock 
variable was postulated to result simply from 
exposure to intense shock and to manifest 
itself in inefficient behavior when Ss are 
tested under stress. Thus, the relationships 
between trauma and later shock-escape be- 
havior, activity under shock, and locomotion 
toward food under extreme deprivation are 
accounted for. Contrariwise, activity and 
locomotion toward food under "unstressful" 
conditions were not affected by previous 
trauma. 

We are still left with the problem of pro- 
viding insight into what such a thing as pure 
shock effect might be. We have far too little 
information as yet to be confident about 
any suggestion. Yet, for what it may be 
worth, we believe that the pure shock effect 
is closely related to the "activation" theory 
of Malmo. 

In several theoretical articles, Malmo 
(1957, 1958) has suggested that the “in- 
tensity" of behavior may be considered as a 
separate and theoretically identifiable di- 
mension. There is an extensive experimental 
literature to support such a view. Intra- 
individual covariation of EEG, GSR, 
muscle-potential gradient, heart rate, blood 
pressure, etc., has been demonstrated under 
stress and these indices are known to be 
related to excellence of performance in the 
form of an inverted U-shaped curve 
(Bartoshuk, 1955; Stennett, 1957). Further- 
more, Malmo and Shagass (1949, 1952) 
have found that psychiatric patients diag- 
nosed as displaying "pathological anxiety" 
show higher levels of physiological reaction 
to stress than nonpatients, so that individual 
differences in sensitivity to “arousal” per- 
haps are identifiable with the clinical defini- 
tion of anxiety. 

Malmo has suggested that permanent
modifications in relation to stress may be
the result of “keeping level of arousal very
high over long periods of time.” Keeping
this in mind, it is possible that the pure
shock effect observed in this series of studies
represents an experimental demonstration of
a change in what has been termed sensitivity
to arousal. The pure shock effect, it will be
remembered, was observed in every experi-
mental situation used, so long as the organ-
ism was at the same time under stressful
conditions. The correspondence is obvious,
although in this report there is no evidence
to substantiate physiological changes in in-
tensity corresponding to those reported by
Malmo.


Malmo’s suggestion that changes in re-
action to stress may occur following ex-
tended periods of high level of arousal was,
of course, not tested in our experiments.
Rather, the treatment periods were quite
short though involving very intense stimula-
tion. It is possible that the severity of the
stimulus acted as a substitute for prolonged
exposure to a less intense stimulus.

The correspondence of the present results 
to those dealing with level of arousal is cer-
tainly not established, but it represents a
particularly interesting possibility, not only 
because it would be a demonstration of in- 
terspecies generalization but also because it 
would point up the value of attacking a prob- 
lem from both ends (treatment, as we did; 
“effect,” as Malmo did). 


Learned Fear. A drive, conditioned to
the grid bars in the shock apparatus, was
postulated to account for the efficient avoid-
ance behavior of experimental Ss in Experi-
ments I, II, and III and the "freezing" be-
havior displayed on the open-field grid in
Experiment V. We believe this drive is the
same phenomenon which Miller (1959) has
studied so extensively. There seems little
reason to doubt now that a fear drive may
be conditioned to visual or tactual cues.
While it is true that most previous investi-
gations of such learned fears have worked
with short-term effects, they should be at
least as long-lasting as other learned habits.
In view of the particularly great resistance
of fear to extinction found by Solomon and
Wynn (1953), it seems reasonable to as-
sume that a fear response conditioned to the
shock grid bars in infancy may be redinte-
grated upon being exposed to such bars as
an adult.


Instrumental Learning. Under specific 
and sharply definable treatment conditions 
an adjustive response of escaping shock may 
be learned. In Experiment III this learning 
took place in Group E, at a very early age.
Again, as in the case of the learned fear
drive, there is no reason to doubt that such
learning will be retained for long periods of
time. Hence, this third residual factor ap-
pears to be a reasonable one to postulate.


Comparison with Other Studies 


Because of differences in procedures and
equipment, the present studies are relatively
independent of other work on traumatic or
early experience. Our presentation has
tended to emphasize the separation. There
are, however, two features of work in this
area to which we can relate our work; they
are shock and handling.

Studies Using Electric Shock as the 
Critical Variable. Rats which had been 
shocked while young, when tested as adults 
by Scott (1955) displayed mild but con- 
sistent differences in a number of situations. 
He also found that weight increases were 
slowed down in experimental animals dur- 
ing the trauma period. The present studies 
substantiate this latter finding with the fol- 
lowing additional feature: The loss is 
transitory since animals recover weight by 
maturity and cannot be distinguished from 
nonshock controls even when subjected to 
severe hunger. We could partially confirm 
his behavioral data; our experimental Ss
also showed greater emotionality, that is,
low activity, than controls when the adult
test situation possessed features similar to
those of the trauma device, but lowered
activity was not found in a “strange but
nonpainful situation.” “Nonpainful” is the
key term here; experimental Ss exposed to
a strange but painful situation as adults
reacted much less to the pain than controls.

Levine, Chevalier, and Korchin (1956)
used treatments similar to those of Scott in
order to study adult avoidance learning.
Rats were either shocked, handled, or
ignored during infancy (1 to 20 days of
age). When all Ss were tested at 60 days
of age for learning efficiency at a hurdle- 
jumping avoidance task, the Es found that 
the Shocked group was significantly in- 
ferior to the Handled group, and that the 
Ignored group, in turn, was inferior to the 
Shocked group. In regard to the superior 
avoidance performance of shocked Ss, the 
present set of experiments confirmed the 
Levine et al. data. But handling early in 
life apparently had little effect on adult be- 
havior (except for the initial response to 
food in Experiment VI). Failure to repli- 
cate the Levine results for handled Ss may 
be attributed to either of two differences in 
procedure: (a) compared to Levine’s rats,
our Ss were handled for shorter periods of
time, 5 or 10 days compared with 20; (b)
the age at which handling occurred was
different in the two studies. Levine's rats
were handled prior to weaning; ours were
invariably handled following weaning.
Therefore, there are no necessary contra- 
dictions in the two sets of data, although, 
together, they suggest limitations in the 
effectiveness of handling. 

Baron, Brookshire, and Littman (1957) 
have submitted evidence showing that when 
Ss traumatized immediately after weaning 
are exposed later to shock-escape or shock- 
avoidance test situations where the CR is a 
lever press, they are consistently superior to 
controls, making more efficient escape re- 
sponses and a larger number of avoidances. 
Now, traumatized Ss in the present study 
did make more avoidance responses than 
controls, but they were grossly inefficient at 
escaping shock. These findings appear to 
contradict those of Baron et al. (1957). Such 
a contradiction may simply reflect the differ- 
ences in procedure and equipment. Trauma- 
tized Ss ordinarily run less when they are 
subsequently shocked; hence, the poorer 
escape performance in the present series is 
a logical outgrowth of the lowered reactivity 
of traumatized Ss to shock relative to naive 
Ss. Where a lever press is the relevant 
response, prior exposure to shock should 
make an animal less susceptible to being dis- 
organized by the recurrence of shock. 
Hence, it is always important to consider 
the nature of the behavior being studied, 
rather than some arbitrary characteristic, 
such as latency or speed. 

Griffiths and Stringer (1952) found that 
rats shocked in infancy were not different 
from controls when successively tested on 
a Warner-Warden maze, a modified Lashley 
discrimination apparatus, the Hall open-field 
test, and for susceptibility to sound-induced 
convulsions. These “negative” findings once 
again do not contradict other studies, in- 
cluding the present one, where differences 
were observed. The Griffiths and Stringer 
experiment utilized a relatively low shock 
intensity for treatment and confounded the 
results through successive testing of the 
same experimental Ss, a procedural flaw 
which has been discussed in the introduction 
to this report. 


Finally, Denenberg and Bell (1960) have 
located critical periods in mice for the 
effects of infantile shock on adult avoid- 
ance learning. Mice shocked prior to wean- 
ing differed from controls in avoidance be- 
havior depending upon the age of traumati- 
zation and the magnitude of the adult UCS. 
These data are only in partial accord with 
those of the present study, for some of the 
Denenberg and Bell experimental groups 
were inferior to controls. However, their 
results extend the information on the effects 
of early traumatic stimulation by suggesting 
that though the age of the S is not a critical 
variable after weaning, it may very well be 
a critical variable prior to weaning, during 
the very rapid and distinctive period of 
maturation in the rodent. 


Studies Using “Handling” as the Critical 
Variable. Recently, a number of studies 
have indicated a relationship between
handling, or gentling, early in life and later 
viability (Bovard, 1958) and learning and 
emotionality (Denenberg & Bell, 1960; 
Levine, et al., 1956; Scott, 1955). These 
studies have indicated further that handling 
is an effective independent variable only 
during a critical period early in the life of 
the organism. The present set of experi- 
ments, although not aimed directly at the 
problem of handling, supports this position. 
Ss handled after weaning in Experiments 
II, III, IV, V, and VI were not different 
from nonhandled controls in behavior and 
viability tests. 


SUMMARY AND CONCLUSIONS 


In a series of six experiments, albino rats
were exposed to a variety of traumatic and 
nontraumatic experiences to determine some 
of the parameters which are critical for the 
relationship between trauma and later be- 
havior. The over-all results provided a con- 
vincing demonstration that trauma in the 
form of intense electric shock does modify 
future behavior. More specific findings 
were: 


1. Age of traumatization is not related to
its effects, at least if the treatment occurs
after weaning (20 days of age). In this
respect the study does not substantiate the
general hypothesis of the "critical period."


2. Exposure to extreme cold or handling 
does not affect Ss in the same way that
electric shock does, at least within the limits 
of this set of experiments. 


3. Coincidental with the behavioral 
changes created by shock, there are transi- 
tory changes in body weight. 


4. The "residua" of trauma are probably
plural, that is, there is more than one change
which may take place in the organism. We
have tentatively labeled these changes in-
strumental habit, acquired fear, and pure
shock effect. Each depends on different
antecedent conditions and each yields some-
what different behavioral consequences.


5. The consequences of trauma may be
adaptive or nonadaptive depending upon the
nature of the test situation and are reflec-
tions of the three mechanisms we have
postulated.


6. Certain modifications of behavior
created by prior electric shock are remark- 
ably broad, appearing in all test situations 
used in this study where drive level is high. 
Other changes, however, which may be 
explained by conventional learning theory, 
are relatively narrow. 

Although the results generally support 
and extend those reported by other experi- 
menters on the relationship between trauma 
and later behavior, the successive and inte- 
grated nature of the six experiments pro- 
vided an opportunity for more detailed 
analysis of the critical variables. Tentative 
hypotheses, based on the results, showed 
the need for a rather complex theory to ex- 
plain what trauma does to the organism, 
but at the same time indicated the need for 
more empirical work.

For more than two thousand years men
have concerned themselves with the
question of how ideas become associated or 
connected to one another. One of the first 
to treat the subject systematically was 
Aristotle, who stated that there were three 
ways in which ideas could become associated, 
through similarity, contrast, or contiguity in 
time or space. Aristotle's laws passed rela- 
tively unchanged through two millennia to 
become in the eighteenth century the corner- 
stone for a philosophical school, British As- 
sociationism. The associationists, subjecting 
the problem to careful scrutiny, debated 
among themselves as to which principles 
were necessary to explain the “association 
of ideas." The extremes of this debate were 
represented by James Mill who thought 
only one principle, contiguity, was necessary, 
and by Thomas Brown, who added four 
secondary principles to Aristotle's three
primary principles. 

Currently, all three principles of associa- 
tion remain acceptable; however, modern psy- 
chologists are inclined to emphasize some 
form of contiguity as the most fundamental 
explanatory concept. Similarity and contrast 
appear to have their greatest appeal in those 
situations where an association is observed
between elements for which no contiguous
relationship can be established. For situa-
tions such as these, proponents of the con-
tiguity principle have introduced the notion 
of mediate association. In the broadest 
sense, association of the presumably inde- 
pendent elements is assumed to be mediated 
by associations between these elements and 
some common term or terms. Thus, while 
Elements A and B are independent, if they 
are associated in some manner (not usually 
clearly stated) with Element C, it is sup- 
posed that A and B will acquire some asso- 
ciative connection to each other. The mediate 
association concept has provided a theoretical 
basis for explaining generalization phe- 
nomena, and the present investigation deals 
with this application. The concern here is 
primarily with the importance of the tem- 
poral order of associating independent ele- 
ments with the common element and the 
direction of association between the inde- 
pendent elements and the common element 
in the verbal transfer situation. 


Mediating Associations 


Mediate association may be invoked to 
explain why two different stimuli elicit the 
same response (stimulus equivalence), why 
two different responses are evoked by the 
same stimulus (response equivalence), or 
why a stimulus and a response elicit each 
other without having been previously di- 
rectly associated (chaining). 

Several theories have been proposed which 
deal with the nature of mediated generaliza- 
tion, and though they differ somewhat as to 
the exact nature of the mediating process, 
they all stem from a series of papers by Hull 
(1930, 1931, and especially 1939). Though
not the first to deal with mediated general-
ization, Hull is usually credited with making
explicit the distinction between generaliza-
tion based upon partial stimulus identity,
primary generalization, or irradiation, and
secondary or mediated generalization (Hull,
1939). Partial stimulus identity is self-
explanatory; primary generalization is close-
ly linked with classical conditioning and is
illustrated by the generalization found on a
physical gradient surrounding a conditioned
stimulus (CS). For example, if we con-
dition a dog to salivate to a tone of 1,000
cps he will also salivate, with a diminished
amount, to all tones immediately surround-
ing 1,000 cps (cf. Bass & Hull, 1934;
Hovland, 1937; Pavlov, 1927). Secondary
or mediated generalization is used to explain 
transfer across different physical dimensions 
and relies on response produced cues, de- 
veloped through the learning process, as 
mediating links. The argument developed 
by Hull is that each response produces its 
own characteristic proprioceptive stimuli 
which may then become associated with 
other responses. Thus, two stimuli asso- 
ciated with a common response may then 
be said to produce a common proprioceptive 
stimulus: 


Stimulus A --> Response B --> (stimulus consequences of B)


Stimulus C --> Response B --> (stimulus consequences of B)


It is further assumed that Response B 
can occur implicitly or fractionally and can 
produce implicit fractional stimulus conse-
quences. Therefore, when one of the
stimuli (e.g., A) is associated with a new
response (D), this response is also asso-
ciated with the stimulus consequences of
the implicit response B:


Stimulus A --> (implicit B) --> (stimulus consequences of B)
     |                                   |
     +----------> Response D <-----------+


It is a simple matter to extend the model
to other stimulus and response arrangements
through the same general form of analysis.
Most of the proposed mediational models
now current have been derived from Hull's
notion of secondary generalization. These
models may be divided into two rather dis-
tinctive types: (a) representational media-
tion models such as those of Cofer and
Foley (1942), Osgood (1952), and Mowrer
(1954); (b) associational mediation models
like that of Jenkins (1955) and Russell
(1955). The former variety places a great
deal of emphasis on the nature of the media-
tion process, as opposed to its effect. The
clearest formulation of this model is that of
Osgood which is based on Hull's fractional
goal response. It is asserted that the implicit
mediating link in the process is a fractional
response which is attached to both stimulus
elements. The associative models, although
not explicitly formulated, appear to require
some sort of implicit verbal mediating link
which, in some ways, is similar to the im-
plicit verbal chains proposed by Skinner
(1957).

A frequently cited example of mediated
association is that provided by the experi-
ments of Shipley (1933, 1935) and Lums-
daine (1939). Although a single paradigm
is usually cited for this research, actually
two widely different paradigms were used.
The so-called Shipley-Lumsdaine paradigm
is ordinarily given as:


Stage 1:    A (light) --> B (wink)
                          tap on cheek

Stage 2:    B (wink) --> C (finger withdrawal)
            tap on cheek    shock

Test Stage: A (light) --> (implicit B wink) --> C (finger withdrawal)


This is a simple chaining paradigm with
an implicit middle term functioning as a
mediating link on the test trial.


Shipley's (1935) paradigm is given as
follows:


Stage 1:    A (light) --> B (wink)
                          tap on cheek

Stage 2:    C (buzzer) --> B (wink)
                           tap on cheek

Stage 3:    A (light) --> (implicit B) --> D (finger withdrawal)
                                           shock

Test Stage: C (buzzer) --> (implicit B) --> D (finger withdrawal)


This is a stimulus equivalence model, 
similar to Hull's, and is more complex than 
the chaining paradigm in that B never acts 
explicitly as a stimulus, serving only as an 
implicit response-stimulus unit in the third 
stage, where it acquires an association with 
D, and again serving implicitly in the test 
stage, where it evokes D as a response. 

These models provide a basis for explain-
ing certain generalization phenomena but
suggest several additional paradigms as well.
Investigation of all of these paradigms
would seem to be necessary to provide an
adequate basis for evaluating the “explana-
tions” which invoke mediate association.
Combining word pairs in such a way that
mediated generalization may be tested in a
three stage paired-associate learning prob-
lem yields eight possible paradigms.


             I      II     III    IV     V      VI     VII    VIII
Stage 1:     A-->B  B-->C  B-->A  C-->B  A-->B  C-->B  B-->A  B-->C
Stage 2:     B-->C  A-->B  C-->B  B-->A  C-->B  A-->B  B-->C  B-->A
Test Stage:  A-->C  A-->C  A-->C  A-->C  A-->C  A-->C  A-->C  A-->C


Fig. 1. The eight paradigms.
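
The count of eight paradigms can be checked mechanically. The following Python sketch is offered only as an illustration and is not part of the original study. It assumes that each of the two learning stages pairs the common term B once with A and once with C, in either direction and in either order, and that the test stage is always A-->C. The names FIG1, enumerate_paradigms, and classify are hypothetical; the Roman numerals are taken from Fig. 1.

```python
from itertools import product

# Fig. 1 as (Stage 1, Stage 2); each pair is (stimulus, response).
# The test stage is A-->C for every paradigm.
FIG1 = {
    "I": (("A", "B"), ("B", "C")),
    "II": (("B", "C"), ("A", "B")),
    "III": (("B", "A"), ("C", "B")),
    "IV": (("C", "B"), ("B", "A")),
    "V": (("A", "B"), ("C", "B")),
    "VI": (("C", "B"), ("A", "B")),
    "VII": (("B", "A"), ("B", "C")),
    "VIII": (("B", "C"), ("B", "A")),
}


def enumerate_paradigms():
    """Every ordered (stage1, stage2) arrangement pairing B once with A and once with C."""
    arrangements = set()
    for first, b_resp_1, b_resp_2 in product("AC", (True, False), (True, False)):
        second = "C" if first == "A" else "A"
        stage1 = (first, "B") if b_resp_1 else ("B", first)
        stage2 = (second, "B") if b_resp_2 else ("B", second)
        arrangements.add((stage1, stage2))
    return arrangements


def classify(stage1, stage2):
    """Label a paradigm by how often B serves as the response term in the learning stages."""
    b_as_response = sum(pair[1] == "B" for pair in (stage1, stage2))
    return {
        2: "stimulus equivalence",  # A and C both lead to B
        0: "response equivalence",  # B leads to both A and C
        1: "chaining",              # B is a response once and a stimulus once
    }[b_as_response]


if __name__ == "__main__":
    assert enumerate_paradigms() == set(FIG1.values())
    for name, stages in FIG1.items():
        print(name, classify(*stages))
```

Two choices of which outer term is learned first, times two directions in each learning stage, gives the eight arrangements, and the classification by the role of B reproduces the grouping of Paradigms I to IV, V and VI, and VII and VIII described in the text.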


Paradigms I, II, III, and IV are “chain-
ing models” suggested by the Shipley-Lums-
daine work, V and VI are “acquired stimu- 
lus equivalence models” suggested by the 
work of Hull, while VII and VIII are re- 
ferred to as “acquired response equivalence” 
models. Although Paradigms I, V, and VII 
have been treated by several investigations, 
the remaining ones have scarcely been 
studied at all. The present study is con-
cerned with the extent to which generaliza-
tion occurs in these eight paradigms as well 
as the adequacy with which an associative 
model accounts for such effects. 


Review of the Literature 


The earliest studies on mediate association 
were done by Scripture (1892), Smith 
(1894), Howe (1894), and Atherton and 
Washburn (1912). The results of these in- 
vestigations were conflicting. In addition, 
the experimental situations were highly 
variable, the materials used were not con- 
stant, and they were all oriented, primarily, 
at discovering the “level of consciousness” 
at which the mediation, if any, took place. 
Thus these early experiments shed little 
light on the mediational process as we are 
concerned with it here. 


4 DAVID L. HORTON AND PAUL M. KJELDERGAARD


Modern experimental investigations of 
mediational models go back at least to the 
work of Shipley and Lumsdaine cited above. 
Their research, of course, dealt with non- 
verbal processes although the paradigms 
have more typically been tested with verbal 
materials using the paired-associate learning 
technique. In this situation the subject is 
asked to learn pairs of words. The left- 
hand terms serve as stimuli and the right- 
hand terms as responses. Several lists are 
learned in which common terms are ar- 
ranged in accord with the paradigm being 
tested. The explanation of generalization or 
facilitation in such cases is made on the basis 
of mediate association. It is assumed that 
the common term occurs implicitly between 
the elements used during the test stage and 
thus facilitates the learning process. 

The first systematic test of any of the 
paradigms used in this investigation was by 
Peters (1935), who ran a series of nine 
experiments using several of the paradigms. 
Although he treated all of his experiments 
as if the paradigm involved were the same, 
in essence he tested Paradigms I, V, and 
VII. He has been credited with testing 
Paradigm VI also, but in the experiment 
referred to, the stimuli are not discrimi- 
nable from responses in both learning stages, 
thus one cannot specify which of the four 
stimulus and response equivalence para-
digms was involved here.²

Results from Peters' experiments were
mixed. Even where he found positive re-
sults, the same paradigm and procedure with
slightly different materials was not likely to
confirm his previous finding. This was true
for all of the paradigms that he tested.
Where facilitation did occur Peters dis-
covered that the majority of mediation re-
sponses were produced by a few subjects.
This factor suggests some sort of individual
parameter which might be a function of a
variable such as the amount of overlearning
in Stages 1 and 2.


² The experiment referred to here is Peters' ex-
periment Number 7. The stimuli and responses to
be associated in the learning stages involved two
figures, the larger circumscribing the smaller, pre-
sented to the subjects simultaneously. Unless one
can assume that when the subject was told to asso-
ciate the larger with the smaller that the subject
then perceived the larger as a stimulus and the
smaller as a response, there is no way of determin-
ing which paradigm is involved here, only that it is
either stimulus or response equivalence.

Of the methodological difficulties involved 
in Peters’ experiments, the most serious 
seems to be the nature of his test stage. In 
six of the nine experiments, Peters depended 
upon a recall test. Since the learning in 
Stages 1 and 2 was carried only to a rela- 
tively low level, such a test might have been 
insufficiently sensitive to detect mediational 
bonds which were present. Further, in sev- 
eral of these experiments recall was tested, 
not immediately following the learning 
phases, but 24 hours later, again decreasing 
the likelihood of detecting a mediational 
effect. 

In the three experiments in which Peters 
used a “trials and errors to criterion” test 
in the third stage, the results were negative 
for Paradigms I and VII and significant in 
the wrong direction for the indeterminable 
paradigm (see Footnote 2). It has been 
shown that generalization effects are most 
prominent in the early phases of the test 
trials (Haagen, 1943; Osgood, 1948; Under- 
wood, 1951); thus, this too provides an in- 
sensitive test of mediated generalization. 

To sum up: Although Peters tested at 
least three and perhaps four of the para- 
digms discussed earlier, due to the variety 
of procedures and experimental materials, 
direct comparisons are possible only for 
Paradigms I with V and I with VII. Here, 
lack of a sensitive test stage and negative 
results leave the findings in question. Fur- 
ther, the unreliability of the finding involv- 
ing the same paradigm and similar method- 
ology, and the lack of statistical tests for 
most experiments leave the significance of 
all the Peters experiments in question. Cer-
tainly, the presence of mediational effects is
strongly suggested in some cases.

Irwin (1951) performed one of the few
experiments using a chaining paradigm other
than the Shipley-Lumsdaine model (our
Paradigm II):


Stage 1: B-->C
Stage 2: A-->B
Stage 3: A-->C


ASSOCIATIVE FACTORS IN MEDIATED GENERALIZATION


Facilitation was found in this situation. It 
can be seen that this paradigm is the same 
as the classic chaining model except that the 
order of Stages 1 and 2 has been reversed. 
Generalization in this case can be mediated 
via the implicit B term occurring during the 
test trial or by the implicit occurrence of C 
as a response to B in Stage 2 or by both of 
these processes. 

Bugelski and Scharlock (1952), using 
nonsense syllables, demonstrated facilitation 
with the simple chaining paradigm used by 
Peters. The mediational effect occurred even 
though the subjects did not report deliberate 
use of the common term as a memorizing de- 
vice. In this experiment, similar to Peters’, 
the paired-associate technique was used and 
all associations were formed within the con- 
text of the experiment. 

A rather interesting demonstration of
chaining, involving two links, was presented
by Russell and Storms (1955). They used
associative chains taken from word associa-
tion tables (Russell & Jenkins, 1954) as
the first stage of their experiment. Chains
of the B-->C-->D variety were selected
from the tables (e.g., Justice-->Peace
-->War). The study was restricted to
those chains where D was never given as a
free response to B and vice versa. A non-
sense syllable was used as a stimulus to
evoke the B term, which then presumably
set off the chain leading to D. The paradigm
was as follows:


Assumed:    B-->C-->D
Stage 1:    A-->B
Test Stage: A-->D


If the C-->D link is viewed as a single
term the paradigm is the same as that used
by Irwin and the demonstrated facilitation
can be explained in the same way.

Mink (1957) compared Paradigms I and
VII. He assumed the first stage, on the
basis of response frequencies in the Russell-
Jenkins norms (1954) for the Kent-
Rosanoff Word Association Test, and used
a single undifferentiated motor response in
the learning stage. The final stage tested
generalization of the motor response to as-
sociates of the words presented in the learn-
ing stage. The paradigms tested were:


            I                         VII
Assumed:    A-->B                     B-->A
Stage 1:    B-->C (motor response)    B-->C (motor response)
Test Stage: A-->C                     A-->C


Generalization was obtained with Paradigm
VII, but not with Paradigm I. Mink hy-
pothesized that the implicit B in the test
stage of Paradigm I offered the subject a
chance to make a discrimination not possible
in Paradigm VII. When paired-associate
learning is used for all three stages and a
different motor response is paired with each
stimulus, facilitation is obtained with both
paradigms, and the chaining model shows a
greater transfer effect (Jeffrey, 1957). Thus,
Mink's findings may be restricted to the par-
ticular situation in which they were obtained.

Jeffrey and Kaplan (1957), using non-
sense syllables in a paired-associate learning
task, tested Paradigm VII and obtained posi-
tive results. Later Jeffrey (1957), pairing
nonsense syllables with six motor responses,
tested both Paradigms I and VII and found
significant generalization for both para-
digms. The amount of generalization for
Paradigm I was significantly greater than
for VII for one subgroup; no differences
between paradigms were found in the other
subgroups. These findings might be recon-
ciled with those of Mink if one assumes
that the different tasks, i.e., learning to as-
sociate a specific response with each stimu-
lus versus learning to discriminate between
sets of stimuli to which a single response is
to be made or not made, give rise to differ-
ent response sets and thus to different
strategies.

As one reviews the literature on the eight 
paradigms outlined above, one is impressed 
with the inequitable use of the various para- 
digms in research. Some paradigms are 
used frequently, others not at all. This may
be due to the investigator's treating all of
the paradigms as if they were alike and then 
choosing one or a few as a matter of con- 
venience. This explanation is particularly 
plausible for the early experimenters such 
as Peters, who seems to have done just that. 
Another possible explanation is that the re- 
searcher, after performing an analysis of 
the possible occurrence of implicit responses 
or associations, has selected for study only 
those paradigms which seem most likely to 
produce "good" implicit chains. It may be 
that some of the paradigms have been 
ignored because they seemed unlikely to 
elicit any mediating responses. However, if
one admits the possibility of bidirectional or 
backward associations being formed in 
paired-associate learning, then there is some 
possibility of mediational effects in all of 
the paradigms. There is evidence which 
seems to warrant careful consideration of 
this possibility. This evidence stems from 
four major sources. Russell (1955) re- 
viewed the work done at Minnesota dealing 
specifically with this problem. Evidence was 
cited from such varied sources as clustering 
during recall (cf. Jenkins & Russell, 1952), 
tachistoscopic recognition time (O'Neil, 
1953), paired-associate learning, etc. Much 
of this work involved the use of stimuli and 
responses from the Minnesota norms for the 
Kent-Rosanoff words for which the forward 
and reverse strengths of the word pairs are 
known; however, even when the stimulus 
and response terms were nonsense symbols, 
most of the above experimental conditions 
yielded evidence favoring bidirectional as- 
sociative links in interpretation. 
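
The argument that bidirectional associations open the possibility of mediation in every paradigm can be made concrete with a small sketch. The Python code below is a deliberately simplified illustration under stated assumptions, not the associative model evaluated in this study: it treats each learned pair as a directed link and asks only whether an A-->B-->C chain is available at the test stage, ignoring association strength and the other mediating routes discussed above. The names PARADIGMS, has_mediating_chain, and paradigms_with_chain are hypothetical.

```python
# (Stage 1, Stage 2) for each paradigm of Fig. 1; pairs are (stimulus, response).
PARADIGMS = {
    "I": (("A", "B"), ("B", "C")),
    "II": (("B", "C"), ("A", "B")),
    "III": (("B", "A"), ("C", "B")),
    "IV": (("C", "B"), ("B", "A")),
    "V": (("A", "B"), ("C", "B")),
    "VI": (("C", "B"), ("A", "B")),
    "VII": (("B", "A"), ("B", "C")),
    "VIII": (("B", "C"), ("B", "A")),
}


def has_mediating_chain(stage1, stage2, bidirectional=False):
    """True if the learned links allow A to reach C through B at the test stage."""
    links = {stage1, stage2}
    if bidirectional:
        # Assumption: every learned S-->R pair also yields a usable R-->S link.
        links |= {(response, stimulus) for stimulus, response in links}
    return ("A", "B") in links and ("B", "C") in links


def paradigms_with_chain(bidirectional):
    """Names of the paradigms in which an A-->B-->C chain is available."""
    return [
        name
        for name, (stage1, stage2) in PARADIGMS.items()
        if has_mediating_chain(stage1, stage2, bidirectional)
    ]


if __name__ == "__main__":
    print("forward only: ", paradigms_with_chain(False))
    print("bidirectional:", paradigms_with_chain(True))
```

Under this toy assumption, forward links alone supply the chain only in Paradigms I and II, whereas admitting backward links supplies it in all eight, which is the sense in which backward association bears on the neglected paradigms.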

Storms (1957, 1958) designed a series of
experiments to investigate alternative ex-
planations of "apparent backward associa-
tion." In one of these experiments he com-
pared Paradigm I with Paradigm VII, using
nonsense syllables as experimental materials,
and found no evidence for either forward or
backward facilitation. A second experiment
involving only Paradigm VI, using as its
experimental materials word pairs selected
from the Minnesota norms so they had a
moderate forward (A-->B) strength, but
virtually no detectable reverse (B-->A)
strength, showed strong generalization
effects for both forward and backward pairs.

In a different type of experiment, Storms 
demonstrated that the mere presentation 
of a response member of a low strength
S-->R chain can greatly boost, at least
temporarily, the apparent strength of this
S-->R pair. This study was cited in
support of a proposed "recency factor" ex- 
planation. Storms feels that this recency 
factor can account for much of what is presumed
to be "backward association" in the literature,
i.e., "backward association" effects are
the result of forward associations which are 
normally at low strength, but have received 
a temporary boost due to the response term 
having recently been presented. He does not 
conclude, however, that recency can com- 
pletely account for the phenomenon. 

Murdock (1956, 1958), in two experi- 
ments, obtained data which support the view 
that backward association is in fact a real 
phenomenon, not accounted for by chance 
connections or mere recency effects. Using 
the transference and interference paradigms, 
he found both transference and interference, 
forward and backward, where predicted. In 
each case, the difference between the amount 
of facilitation or of interference between 
forward and backward pairs favored those 
in a forward direction, but these differences 
were small and statistically not significant. 
Forward and backward pairs differed signifi- 
cantly from the controls for both the inter- 
ference and the transfer paradigms. 

Jantz and Underwood (1958) used a 
paired-associate learning task, in which non- 
sense syllables (stimuli) were paired with 
adjectives (responses), to determine the influence
of the number of S——>R exposures
and the association value of the nonsense
syllables on the formation of backward
(R——>S) associations. Two tests were
utilized: First, after a given number of 
S——>R presentations, the subjects were 
given the R term (adjectives) and asked to 
recall the S words (nonsense syllables). 
Secondly, the same subjects were then given 
the R—>S pairs on a memory drum and 
required to learn them to a criterion of one 
perfect trial. Both the frequency with which 
the R word elicited the correct S word in 
the recall task and the amount of facilitation 
found in the second stage of the transfer
paradigm proved to be a joint function of
the two independent variables, the number of
S——>R pairings and the association value
of the nonsense syllables involved. Of particular
interest are these authors' findings
that the strength of both the S——>R and
R——>S connections appears to be an asymptotic
function of the number of S——>R
pairings and that the asymptote for the
S——>R connections is considerably higher
than the asymptote for the R——>S connections.
These findings are consonant with
one of the theoretical assumptions to be put 
forth shortly. 

This brief review of the literature illus- 
trates several points. First, there is evidence 
for mediated generalization in a wide variety 
of verbal experiments. Second, there is 
enough evidence for the concept of bidirec- 
tionality of associational links that bidirec- 
tionality cannot be dismissed as an artifact 
of the experimental procedures or the organ- 
ism's learning history. Third, there is a 
dearth of comparisons between the above 
illustrated paradigms even though such com- 
parisons would bear directly on the issue of 
the bidirectionality of associations, and have 
important general theoretical implications for 
mediated generalization. 


Analysis of the Paradigms 


One additional point is apparent from the 
literature, namely, that the majority of in- 
vestigators in this area have failed to ex- 
ploit fully the possibility of implicit associa- 
tions occurring prior to the test stage; there 
does not, however, seem to be any justifica- 
tion for such a restriction. Irrespective of 
the type of implicit responses that one talks 
about, it appears that the process should 
operate in all instances of the same task. 
Consider the situation in which learning is 
confined to the experimental setting. When 
the subject learns the first pair of elements 
no additional associations are likely to be 
strengthened. This ignores, of course, associations
to the stimulus or response elements
which occur irrespective of the familiarity
of the materials. But during Stage 2, in all
paradigms, the formation of other links is
possible. Theoretical precedent for this assumption
was established in the third stage
of the Shipley (1935) model, and it is a
simple matter to extend the process to the 
second stage of the three-stage paradigm. 
For example, the process can be viewed by 
reference to Figure 2. 

Now, in addition to mediation via the 
familiar, implicit B term during the test 
stage, another association (indicated by the 
dotted line) is suggested. The effect of this 
linkage, established during Stage 2, can be 
viewed in at least two ways. First, since 
both the processes separately facilitate the 
learning of A——>C pairs, during the test 
stage the combined effect could increase this 
facilitation. However, at the start of the test 
trials, the A——>C and A——>B links 
could be considered as competing tendencies 
which would delay the learning process. The
essential point to be made here is that consideration
of such linkage provides a basis
for explaining different generalization effects
between paradigms which would not
otherwise be possible. To illustrate the above
point, the eight paradigms will be analyzed
in terms of possible mediating links which
could be active in generalization if it should
take place. The symbols to be used are as
follows:

——> indicates a forward explicit association operating in the learned direction
<—— indicates an association operating in the opposite direction from which it was learned; this will be called a reverse association
( ) indicates a presumed implicit response
---> indicates a presumed implicit forward association between two terms
<--- indicates a presumed implicit reverse association between two terms

Fig. 2. Possible mediators in the Shipley-Lumsdaine paradigm.


The eight paradigms to be analyzed may be 
conceptualized as indicated in Figure 3.

The symbols and the analysis of the para- 
digms presented in Figure 3 can be clarified 
by examining Paradigms I and V. The first
stage in both cases involves, simply, overt 
paired-associate learning (A——>B). The 
second stage, in addition to the overt learn- 
ing, presumably provides the opportunity for 
implicit associations to an implicit response 
which occurs as a result of the first stage 
learning. In Paradigm I, the implicit response
(A) is assumed to occur whenever
the overt Stimulus B appears. This, presumably,
is the result of the formation of a
reverse association (A<——B) during the
Stage 1 learning (A——>B). This implicit
response or its stimulus consequences could
then become attached to the overt Response
C, thus establishing an (A)--->C association.
In the test stage, the A——>C learning
may be facilitated via the overt chain formed
stepwise during Stages 1 and 2, or through
the postulated implicit chain formed during
the second stage, or both. In the second
stage of Paradigm V, the implicit (A) is
presumably elicited by B as a result of the
reverse association (A<——B) formed during
the A——>B learning of Stage 1. The
Stimulus C may then become attached to the
implicit response, (A), resulting in a C--->(A)
bond. If learning a C--->(A) association
results in the formation of an A--->C
connection, then facilitation or generalization
should be demonstrated in the test stage. In
addition, the A——>B connections and the
C——>B links formed in the learning stages
may have some facilitative effects in the test
stage of Paradigm V.

Fig. 3. An analysis of possible mediators in
three-stage paired-associate learning.

DAVID L. HORTON AND PAUL M. KJELDERGAARD

The importance of this kind of analysis
may be illustrated by contrasting Paradigms
I and II. These two situations reveal one
essential difference, namely the nature of the
A--->C link established during the second
learning stage. In view of this, differences
in generalization effects between the two
paradigms would indicate a contiguity factor
in mediated association which would be of
considerable importance. Further, it becomes
readily apparent from this analysis that if
we restrict ourselves to forward associations,
generalization effects can be expected
only from Paradigm II (all forward links)
and possibly from Paradigms I and [illegible],
where purely forward links are restricted to
the test stage and second stage, respectively.


An Associative Model for Mediated Generalization


Generalizing from the available literature
and based upon the analysis of the eight
paradigms presented above, an associative
model for mediated generalization will be
put forth in order to generate specific hypotheses
about the generalization effects and
the relationship of the effects among the
eight paradigms tested here.

The model to be presented represents an
extension of the associative type of media- 
tional model proposed by Jenkins and Rus- 
sell. This model is to be restricted to purely 
verbal, paired-associate learning. This is 
not to say, of course, that these notions are 
not applicable to the nonverbal situation but 
simply that such application is not of concern 
here. 


The assumptions of the proposed model 
are: 


Assumption 1. All secondary generalization,
in paired-associate learning, is mediated
in an associative manner. Beyond the 
analysis made here the exact nature of this 
process will not be specified. 


Assumption 2. Once an A——>B habit has
been established the presentation of A will 
tend to elicit B, explicitly or implicitly, and 
this will be true in all situations in which 
A is involved. 


Assumption 3. The establishment of a 
habit A——>B simultaneously establishes a
habit B——>A, so that the occurrence of B in
any situation tends to elicit A, explicitly or 
implicitly. 

Assumption 4. In establishing an A——>B
habit, the tendency of A to elicit B will at 
all times be stronger than the tendency of 
B to elicit A. The exact nature of this func- 
tion shall remain unspecified. 


Assumption 5. Upon establishing a habit
chain A——>B——>C, each subsequent occurrence
of A will tend to elicit both B and C,
and the strength of each tendency will depend
upon contiguity. The closer the contiguity,
the stronger the relationship. (In
the above chain A——>B is stronger than
A——>C.)


Assumption 6. Upon establishing a habit
chain A——>B——>C, there develops a
tendency for C to elicit both A and B, and
the strength of this tendency depends upon
contiguity; the closer the contiguity in the
learning situation, the stronger the reverse
chain. (In the example here, the tendency
of C to elicit B is stronger than the tendency
of C to elicit A.)

Assumption 7. The directionality of an
implicit association takes precedence over
the contiguity of such an association so that
a forward remote association is a more 
effective mediator than a reverse contiguous 
association. This, of course, assumes that 
the frequency of exposure of these connec- 
tions is equal. 

Assumption 8. For all habits there exists 
a certain threshold and the strength of a 
given habit must exceed this threshold before
it will manifest itself. This assumption
refers to the frequency with which word 
pairs must be presented to any given sub- 
ject, and of course holds only on the aver- 
age, since individual pairs may in some way 
be especially easy or difficult to learn. Such 
ease or difficulty is presumably a function
of stimulus overlap, either in the visual or 
auditory sense, of the type referred to as 
primary stimulus generalization or some 
function of previous contiguities in the par- 
ticular histories of the subjects. 


Hypotheses and Predictions 


On the basis of the above set of assump- 
tions and the analyses of the eight para- 
digms as presented earlier, the following 
hypotheses are generated (note that Hy- 
potheses 2, 3, and 4 are based on the analyses 
of the implicit and explicit associations 
which take place during the second learning 
stage):

Hypothesis 1. All paradigms will show
significant generalization effects. Insofar as 
each of the eight paradigms has the possi- 
bility of mediational links being formed in 
the second stage or being represented in the 
test stage, some generalization can be ex- 
pected to occur. 

Hypothesis 2. Paradigms I, II, VI, and 
VII jointly will show a greater amount of 
generalization than Paradigms III, IV, V, 
and VIII together. Paradigms I, II, VI,
and VII all involve forward A--->C links
being established in the second stage learning,
two contiguous and two remote; whereas
Paradigms III, IV, V, and VIII involve
backward links (A<---C) being formed in
the second stage, again two contiguous and
two remote. This hypothesis can be most
easily illustrated by contrasting the two
stimulus equivalence paradigms with each
other or by comparing the response equivalence
paradigms. Figure 3 clearly shows
that the only difference between paradigms 
within each of these subsets is the direction- 
ality of the implicit association formed in 
the second stage learning. Similar compari- 
sons are possible with the remaining para- 
digms. 

Hypothesis 3. Paradigms I, IV, VII, and 
VIII jointly will show a greater amount of 
generalization than Paradigms II, III, V,
and VI taken together. Here it is postulated 
that the response equivalence paradigms and 
two of the chaining models will show greater 
generalization than the stimulus equivalence 
paradigms and the other two chaining para- 
digms due to the contiguity factor involved 
in the second stage learning. In Paradigms 
I, IV, VII, and VIII the stimulus evokes 
the mediational response which presumably 
may then be directly connected to the re- 
sponse term; whereas in the stimulus 
equivalence models and chaining Paradigms 
II and III, the response member elicits the 
mediational response which then must be 
connected to the stimulus member remotely. 
To exemplify this hypothesis, we might
compare Paradigms I with II and III with 
IV. Here the associative connections are 
the same except for differences in the con- 
tiguity of the mediator in the second learn- 
ing stage. 

Hypothesis 4. Considering simultaneously 
the two basic postulated factors, directionality
and contiguity, the paradigms can be
divided into four groups of two paradigms
each (in terms of the anticipated generalization
effects). The positions of Paradigms I
and VII as being the strongest set of paradigms
and III and V as being the weakest
set are derivable from Hypotheses 2 and 3;
consequently, this hypothesis adds only a
prediction about the four remaining paradigms.
It is predicted that Paradigms II
and VI will show more generalization in
the test stage than Paradigms IV and VIII.
This is derivable from Assumption 7,
namely that directionality is more influential 
than contiguity. 
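
The grouping implied by Hypotheses 2 through 4 can be sketched in a few lines of Python. This is only an illustration: the set memberships are read directly from the hypotheses as stated, while the names `FORWARD`, `CONTIGUOUS`, `rank_key`, and `predicted_groups` are hypothetical and introduced here. The sort places directionality ahead of contiguity, which is the ordering Assumption 7 asserts.

```python
# Paradigm classification as stated in Hypotheses 2 and 3.
FORWARD = {"I", "II", "VI", "VII"}        # forward implicit A-to-C link in Stage 2
CONTIGUOUS = {"I", "IV", "VII", "VIII"}   # mediator contiguous in Stage 2
PARADIGMS = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII"]


def rank_key(paradigm):
    """Sort key: directionality first, contiguity second (Assumption 7)."""
    forward = paradigm in FORWARD
    contiguous = paradigm in CONTIGUOUS
    # False sorts before True, so negate to put the favorable level first.
    return (not forward, not contiguous)


def predicted_groups():
    """Return the four predicted groups, strongest generalization first."""
    groups = {}
    for paradigm in sorted(PARADIGMS, key=rank_key):
        groups.setdefault(rank_key(paradigm), []).append(paradigm)
    return list(groups.values())


if __name__ == "__main__":
    for rank, group in enumerate(predicted_groups(), start=1):
        print(rank, group)
```

Under these memberships the sketch reproduces the ordering of Hypothesis 4: Paradigms I and VII first, then II and VI, then IV and VIII, and III and V last. It makes no prediction within a group, in keeping with the text.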


Insofar as the analysis of the eight paradigms
tested here shows no two paradigms
to be exactly alike in terms of possible
mediating links, one could extend Hypothesis
4 to make specific predictions about
the anticipated generalization effects for each 
paradigm. However, this would necessitate 
further assumptions, or extensions of the 
assumptions already put forth. Such an ex- 
tension seems premature at this point. 

To illustrate by example the difficulties in
making further predictions at this point we 
need only to examine the effects of the 
mediational links in the test stage. Except 
for Paradigms I and II where the analysis 
appears uninvolved, it is difficult to say what 
effects, if any, the mediators in the third 
stage will have. Consequently, except for 
Hypothesis 1, the above predictions are 
based primarily upon an analysis of the 
second learning stage. Further predictions 
would seem to be contingent upon further 
empirical evidence about the relative gener- 
alization effects of these eight paradigms 
and information concerning what is happen- 
ing in the test stage. 


Summary 


The present study deals with the role of 
mediate association in verbal paired-asso- 
ciate learning paradigms. Although the his- 
tory of mediate association goes back to Hull's 
work in the 1930s and the experiments of 
Shipley (1933, 1935) and Lumsdaine (1939), 
a surprisingly small amount of research has 
been done on the topic. Most of the research 
that has been carried out in this area was per- 
formed with verbal materials in the context 
of paired-associate learning. These studies 
have suggested several factors such as re- 
verse associations (Storms, 1957) and con- 
tiguity which may be important in the media- 
tion process. Furthermore, they provide a 
basis for extensions of the model to other 
phases of the learning situation which could 
provide important information about trans- 
fer processes in general. 

ASSOCIATIVE FACTORS IN MEDIATED GENERALIZATION

Eight paradigms are tested here, two in
which stimuli are experimentally equated,
two in which responses are experimentally
equated, and four in which (S——>R1,
R1——>R2) response chains are established.
Five of these paradigms have been investigated
before with mixed results; the other
three have been completely ignored. Seldom have any of
these paradigms been compared directly
with each other using the same materials and
procedures. Where comparisons have been
made, they have involved only two of the
eight paradigms at any one time.

The paradigms involved are designed to 
shed light on the effectiveness of the various 
implicit links formed in connecting stimuli 
and responses, where effectiveness is meas- 
ured by ease of learning in the test stage. 
The primary importance of this study lies 
in the detection of the presence or absence of 
generalization effects in the eight paradigms 
tested, insofar as these effects may then be 
attributed to the directionality of the associa- 
tional links and the contiguity of the media- 
tors in the various stages. Secondarily, 
differences between the paradigm effects, if 
found, would also reflect on the tenability 
of the two postulated mechanisms. 


METHOD 
Description of the Task 


From the standpoint of the subjects, this experi- 
ment appeared to involve a series of paired-associate 
learning problems. The subjects were required to 
learn two lists of words to a specified criterion, 
and were then tested on a third list for a fixed 
number of trials. Each list consisted of eight pairs 
of words, and the lists were constructed in such a 
way that the third list was composed of four pairs 
which had a common associate in the first two lists
and four pairs which had no such connection. This
procedure can best be illustrated by an example
(stimulus equivalence)—see Table 1. 


TABLE 1
PAIRED-ASSOCIATE LISTS

Stage 1     Stage 2     Test Stage
List x      List y      List z
A1 - B1*    C1 - B1*    A1 - C1
A2 - B2*    C2 - B2*    A2 - C2
A3 - B3*    C3 - B3*    A3 - C3
A4 - B4*    C4 - B4*    A4 - C4
A5 - B5     C5 - D1     A5 - C5
A6 - B6     C6 - D2     A6 - C6
A7 - B7     C7 - D3     A7 - C7
A8 - B8     C8 - D4     A8 - C8

* Identical words.


In the first stage, the subject learns, by the 
method of anticipation, to associate a particular B 
word with a particular A word. In Stage 2, four 
of the same B words are again used as responses
and associated to four C words. Four other C words
act as stimulus members of C——>D (control)
pairs. In the final stage, the test stage, all of the
A words from List x are paired appropriately with
all of the C words from List y. During the test
stage, the A——>C pairs which have in common
an association with a B word will presumably be
learned faster than A——>C pairs with no such
connection. The amount of generalization is then
inferred from the number of correct anticipations
of mediated A——>C pairs versus the number of
correct anticipations of nonmediated A——>C
pairs. It should be noted that for any subject, all of
the A words had occurred with equal frequency in
Stage 1 and all of the C words with equal frequency
in Stage 2, thus controlling for any differences
which might result from unequal familiarity
with the material. The same general principles
were followed for the remaining paradigms. 
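
As a rough illustration of the scoring just described, the following Python sketch tallies correct anticipations for mediated and nonmediated test pairs and reports their difference. The function name, the pair labels, and the data layout (one dictionary per test trial) are assumptions made for the example; the text specifies only that the two counts are compared.

```python
def generalization_score(trials, mediated_pairs):
    """Count correct anticipations for mediated and nonmediated pairs.

    trials: list of dicts, one per test trial, mapping a pair label to
            True (correct anticipation) or False.
    mediated_pairs: set of labels for pairs sharing a common associate.
    Returns (mediated_correct, nonmediated_correct, difference).
    """
    mediated_correct = 0
    nonmediated_correct = 0
    for trial in trials:
        for pair, correct in trial.items():
            if not correct:
                continue
            if pair in mediated_pairs:
                mediated_correct += 1
            else:
                nonmediated_correct += 1
    return mediated_correct, nonmediated_correct, mediated_correct - nonmediated_correct


if __name__ == "__main__":
    # Hypothetical record for one subject: two trials, four pairs shown.
    example_trials = [
        {"A1-C1": True, "A2-C2": False, "A5-C5": False, "A6-C6": False},
        {"A1-C1": True, "A2-C2": True, "A5-C5": True, "A6-C6": False},
    ]
    print(generalization_score(example_trials, {"A1-C1", "A2-C2"}))
```

Because every subject contributes both kinds of pairs, the difference score is a within-subject measure, which is what allows each subject to serve as her own control.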


Learning Materials 


In order to test adequately the generalization ef- 
fects in this experiment it was felt that the stimulus 
materials should be relatively unfamiliar to the 
subjects and that there be no known connections 
between the stimuli. Ordinarily nonsense syllables 
are used for this purpose. However, learning lists 
of nonsense syllable pairs is much more difficult 
and time-consuming than learning more familiar 
material. Furthermore, it was felt that nonsense 
syllables would require a higher degree of over- 
learning in Stages 1 and 2 to establish levels of 
associative strength which would be stable enough 
to demonstrate generalization in the test stage. In 
view of these considerations real words were used 
rather than nonsense syllables. 

Five-letter words, occurring very infrequently in 
print, were selected from The Teacher's Word
Book of 30,000 Words (Thorndike & Lorge, 1944).
Thus real words were used, words which had the
phonemic and distributional properties of the sub- 
jects’ native tongue, and yet which were relatively 
unfamiliar to the subjects and consequently un- 
related to each other. 

Approximately 200 words were selected for preliminary
investigation,* and 110 were eliminated as
being too difficult to pronounce or having strong
obvious associations. The remaining 90 words were
mimeographed in a two-page booklet in the form
of a free association test. This test was administered
to 98 students in introductory psychology
classes at the University of Minnesota.

* The frequency criterion used here was that a
word must have occurred at least .25 times per
1,000,000 words on the general count, but not more
than 2/1,000,000 on the Lorge magazine, Lorge-
Thorndike semantic, and Thorndike 1931 general
count.

The results of the free association test provided 
the basis for eliminating words which suggested 
strong common associates. Twenty-eight words
with a frequency of common response ranging from
3 to 34 (median = 11) were selected for the experiment
proper. Thus all of the words had relatively
heterogeneous associations, although the association
values were high, ranging from 66 to 93% (median
= 83%). It should be noted, however, that most of
the associations given to these words were either of
the clang variety (e.g., krone-prone) or were based
on the formal characteristics of the stimulus words 
(e.g., draff-draft). 

The 28 words were randomly divided into three 
lists of 8 words each and a fourth list which con- 
sisted of the 4 words to be used as controls. The 
lists were as follows: 


List 1    List 2    List 3    List 4
MERLE     NADIR     ARRAS     PRAWN
BANAL     DELFT     KRAAL     GNOME
KRONE     CAIRN     DAVIT     TRIPE
UMBER     VENAL     BLEAR     NILUM
FAGOT     WINCH     REAVE
SWALE     ETUDE     BEDEW
DRAFF     LLANO     LIMBO
TRYST     TAUPE     NONCE


Since the ease with which particular words or 
word pairs are learned constitutes an extraneous 
variable in this type of experiment, several controls 
were utilized to reduce any such effect to a minimum.


1. The words in each experimental list were randomly
paired with the words in every other list.
This provided six different lists of pairs since an 
independent randomization was carried out when 
the word lists appearing in the stimulus and re- 
sponse positions were reversed. 


2. To control for any effect which might result 
from having the words in a particular experimental 
list appear only in the first or second stage, or only 
in the stimulus or response positions, each of the 
six lists mentioned above was used in the test stage 
an equal number of times for all paradigms. Since 
the connections in the test stages vary, the preced- 
ing stages necessarily vary to agree with the basic 
paradigms.


3. The control words were randomly substituted 
for half of the experimental words in the list com- 
mon to the first two stages. For any given para- 
digm this substitution was made in the first stage 
for half the cases and in the second stage for the 
other half. With the chaining paradigms, this coincidentally
resulted in the control words appearing on
a 50-50 basis with respect to the stimulus and response
positions. This control necessitated 6 additional
lists, or 12 lists of pairs in all.


4. Five randomizations in the order of each 
group of eight pairs were made to prevent serial 
learning. One restriction, that no pair could appear 
twice in succession, was imposed on this randomiza- 
tion process. 
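
Controls 1 and 4 lend themselves to a short sketch. The Python below builds one independently randomized list of pairs for each ordered (stimulus list, response list) combination of the three experimental lists, giving six lists, and then draws five presentation orders for a list of pairs. The word lists are those given above; everything else is an assumption for illustration: the function names, the use of a pseudo-random generator, and the reading of the restriction "no pair could appear twice in succession" as applying where one randomization ends and the next begins.

```python
import itertools
import random

# Experimental word lists from the text (List 4, the control words, is omitted).
WORD_LISTS = {
    "List 1": ["MERLE", "BANAL", "KRONE", "UMBER", "FAGOT", "SWALE", "DRAFF", "TRYST"],
    "List 2": ["NADIR", "DELFT", "CAIRN", "VENAL", "WINCH", "ETUDE", "LLANO", "TAUPE"],
    "List 3": ["ARRAS", "KRAAL", "DAVIT", "BLEAR", "REAVE", "BEDEW", "LIMBO", "NONCE"],
}


def make_pair_lists(word_lists, rng):
    """Randomly pair each list with every other list, separately per direction."""
    pair_lists = {}
    for stim_name, resp_name in itertools.permutations(word_lists, 2):
        responses = list(word_lists[resp_name])
        rng.shuffle(responses)  # independent randomization for each direction
        pair_lists[(stim_name, resp_name)] = list(zip(word_lists[stim_name], responses))
    return pair_lists


def make_orders(pairs, n_orders=5, rng=None):
    """Draw n_orders random orderings; reject one that would repeat a pair at the seam."""
    rng = rng or random.Random()
    orders = []
    while len(orders) < n_orders:
        order = list(pairs)
        rng.shuffle(order)
        if orders and orders[-1][-1] == order[0]:
            continue  # the same pair would appear twice in succession
        orders.append(order)
    return orders


if __name__ == "__main__":
    rng = random.Random(1961)  # arbitrary seed, for reproducibility only
    pair_lists = make_pair_lists(WORD_LISTS, rng)
    print(len(pair_lists), "lists of pairs")
    for order in make_orders(pair_lists[("List 1", "List 2")], rng=rng):
        print(order[0], "...", order[-1])
```

The sketch does not cover Controls 2 and 3 (counterbalancing across stages and substitution of control words), which depend on details of the paradigm assignments not spelled out here.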


Apparatus 


The apparatus used in this experiment was a
Hunter 403 Cardmaster. The function and per- 
formance of this instrument is similar to that of a 
memory drum, except for the mode of stimulus 
presentation. Instead of a tape on a revolving 
drum, the cardmaster utilizes plastic cards exposed 
at a predetermined rate, for presentation of the 
stimulus materials. 

The primary advantage of the cardmaster over a
memory drum, for this experiment, is the facility 
with which the stimulus materials may be changed. 
With each subject being exposed to 3 different lists 
of word pairs, eight paradigms being tested simul- 
taneously, and 12 different lists being used for any 
single paradigm, this was a practical consideration. 

The experimental words were typed on plastic 
tape and affixed to the approximate center of each 
half of the 3.5" x 6" plastic cards. All letters were
typed in pica capitals. 

The materials to be learned were exposed to the 
subject in the following order: stimulus word alone 
—2 seconds, stimulus and response words together— 
2 seconds, interpair interval—2 seconds. The card- 
master automatically returned each card, once ex- 
posed, to the end of the list so that there was no 
lapse of time, other than the 2-second interpair 
interval, between the end of the list and the begin- 
ning of its repetition. 
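
The timing of a trial follows from these exposure intervals. The short Python sketch below works out the duration of one pass through an eight-pair list and of a run of trials; the constant and function names are hypothetical, and the 35-trial figure used in the example is the limit mentioned in the Subjects section.

```python
STIMULUS_ALONE_S = 2         # stimulus word alone
STIMULUS_AND_RESPONSE_S = 2  # stimulus and response words together
INTERPAIR_INTERVAL_S = 2     # interval between pairs
PAIRS_PER_LIST = 8


def seconds_per_trial():
    """Duration of one pass through the list, in seconds."""
    per_pair = STIMULUS_ALONE_S + STIMULUS_AND_RESPONSE_S + INTERPAIR_INTERVAL_S
    return PAIRS_PER_LIST * per_pair


def minutes_for_trials(n_trials):
    """Duration of n_trials consecutive passes, in minutes."""
    return n_trials * seconds_per_trial() / 60


if __name__ == "__main__":
    print(seconds_per_trial())     # 48 seconds per trial
    print(minutes_for_trials(35))  # 28.0 minutes for 35 trials
```

On these figures a single stage carried to 35 trials would occupy more than half of a 50-minute session, which is consistent with the time limit described for the subjects.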


Subjects 


The subjects for this experiment were female 
students enrolled in introductory psychology classes 
at the University of Minnesota. These students,
mainly sophomores, either volunteered directly in 
class for the experiment or indicated on their class 
registration cards that they were willing to partici- 
pate in an experiment. The latter groups of sub- 
jects were contacted by postal card or telephone, or 
both. One hundred and fifty-seven subjects were 
used although 13 had to be eliminated for failure 
to learn one of the first two stages in the required
35 trials. Allowing the subject to continue beyond 
this point would have precluded the possibility of 
the subject finishing the experiment in the 50-minute 
time available.

The experimenters restricted themselves to female
subjects for two reasons. In the first place females
are generally superior to males in verbal tasks
(Anastasi, 1958) and it was felt that using females
would permit a higher percentage of subjects to
finish within the time allowed. Second, restricting
subjects to one sex helps to minimize experimental
error, consequently providing a more sensitive test
of the hypotheses involved.


Procedure


The experimental room contained two tables and 
two chairs. The subjects were brought in indi- 
vidually and seated in a chair facing the table on 
which the cardmaster was placed. The stimulus 
material was kept on the second table, behind the 
first, but out of the subject's direct line of vision. 
When the subject was seated, the experimenter read 
the following directions, switched on the apparatus, 
and took a seat behind the subject, in full view of 
the cardmaster:


INSTRUCTIONS 


Part I 


This experiment is concerned with some of the 
factors involved in verbal learning, specifically 
with the way in which pairs of words are 
learned. 

You will be asked to learn several lists of word 
pairs, each list consisting of eight such pairs. 
When the experiment begins, two shutters will 
cover this window [pointing to the cardmaster 
window]: the shutter on the left will rise first,
exposing a word; then the shutter on the right 
will rise, exposing a second word which will al- 
ways be paired with the first word. 

Your task will be to pronounce the left-hand
word aloud when it appears, and then try to 
anticipate the right-hand word before the shutter 
rises. If, after saying the word on the left, you 
have no idea what the right-hand word is, or if 
you should guess the word incorrectly, then read 
aloud the right-hand word when it appears. 

Do you have any questions? 

It is important that you be familiar with the 
sound of these words as well as their spelling, so 
I am going to give you this deck of cards [hands 
cards to subject] which has the words to be used 
typed on them. I want you to look at each word 
and then say it aloud. 

Now I will read the words to you once, in 
pairs, as they will appear in the window. This 
will help you become more familiar with the 
sound of the words and make learning the list 
easier.

[Read the fifth randomization to subject.] 

When the machine is started, remember to say 
the words aloud and to try to anticipate the right- 
hand word of each pair as soon as you can. 


Part II 


This part of the experiment will be just like 
the first except some of the words will be dif- 
ferent. Here is a deck of cards like the other one 
I gave you. [Hand deck to subject.] Go through 
the deck reading each word aloud as you did 
before. Again I will read the pairs to you to give 
you an idea of what to expect. 

[Read fifth randomization to subject.] 

Remember to pronounce the words aloud. 


Part III 


This part of the experiment will be like the 
previous ones, except no new words will appear 
and the pairings will be different. I will read 
the pairs to you once and then we will begin. 


[Read fifth randomization to subject.] 


The somewhat novel technique of reading the 
pairs of words to the subjects was the outcome of 
a pilot study. This procedure appeared to reduce 
the amount of time required to learn the lists of 
pairs without interfering with the results of the 
third stage. The fifth randomization was read prior 
to the presentation of the lists on the cardmaster. 
Thus the subject would not have the same order of 
pairs read to her that would appear in the first 
randomization on the cardmaster. 

The pilot study also indicated that a satisfactory 
learning criterion for the first two stages would be 
three successive errorless trials. This allowed mod- 
erately strong associations to be built and yet was 
attainable by most subjects within the 50-minute 
experimental session. 
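
The learning criterion can be expressed as a simple check over a subject's trial-by-trial error counts. The sketch below is illustrative only: the function name and input format are assumptions, as is the reading that the three errorless trials must be completed within the 35-trial limit given in the Subjects section.

```python
def trials_to_criterion(errors_per_trial, run_length=3, max_trials=35):
    """Return the 1-based trial on which the criterion is reached, else None.

    errors_per_trial: number of incorrect anticipations on each trial, in order.
    run_length: successive errorless trials required (three in this study).
    max_trials: trials allowed before the subject is dropped (assumed 35).
    """
    streak = 0
    for trial_number, errors in enumerate(errors_per_trial[:max_trials], start=1):
        streak = streak + 1 if errors == 0 else 0
        if streak == run_length:
            return trial_number
    return None


if __name__ == "__main__":
    # Hypothetical record: errors on trials 1, 2, and 5; criterion met on trial 8.
    print(trials_to_criterion([3, 1, 0, 0, 1, 0, 0, 0]))
```

A return value of `None` corresponds to a subject who would have been eliminated for failing to learn the stage in the allotted trials.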

Five trials or randomizations were given to each 
subject on the third stage. Further practice re- 
sulted in a decrease in apparent generalization due 
to the subjects' learning all of the pairs. 

During the experimental trials (both learning 
and generalization), the experimenter recorded the 
correct and incorrect anticipations. Once the ex- 
periment was completed the experimenter recorded 
the pertinent data and this was independently 
checked at a later time. 

The order in which the paradigms were run was 
determined at random. The subjects were assigned 
numbers (1-144) in order of their appearance and 
randomization was carried out with the aid of a 
table of random digits (Rand Corporation, 1955).
Eighteen subjects were randomly allocated to each 
paradigm. The writers both served as experiment- 
ers, the assignment to subjects being made on a 
chance basis. 
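
A present-day equivalent of this allocation can be sketched as follows. The original procedure used a printed table of random digits; the pseudo-random shuffle, the function name, and the seed below are stand-in assumptions chosen only for illustration.

```python
import random


def allocate_subjects(n_subjects=144, n_paradigms=8, seed=None):
    """Randomly allocate subject numbers to paradigms in equal-sized groups."""
    if n_subjects % n_paradigms != 0:
        raise ValueError("subjects must divide evenly among paradigms")
    rng = random.Random(seed)
    subjects = list(range(1, n_subjects + 1))  # numbered in order of appearance
    rng.shuffle(subjects)
    group_size = n_subjects // n_paradigms
    # Paradigm k receives the k-th slice of the shuffled subject numbers.
    return {
        p + 1: sorted(subjects[p * group_size:(p + 1) * group_size])
        for p in range(n_paradigms)
    }


allocation = allocate_subjects(seed=1955)  # seed is arbitrary
```

Each of the eight paradigms receives 18 subject numbers, and every number from 1 to 144 is used exactly once.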


Statistical Design 


Prior research by the authors and their associates 
had indicated that it would be advantageous to use 
each subject as her own control, since intersubject 
variability was sufficiently great to conceal impor- 
tant generalization effects. Such an experimental 
manipulation is analyzed most adequately by treat- 
ing it as a split-unit design (essentially the same 
as a split-plot design). This type of analysis yields 
information on the significance of interparadigm 
differences (whole units), intraparadigm differences 
(subunits), and the interaction between the whole 
units (paradigms) and the subunits (experimental 
vs. control pairs). The latter tests are more sensitive
than the first inasmuch as the main effects
are confounded in the split-unit design (see Kempthorne,
1952; Lindquist, 1956). A further consideration
in the use of this design was that several
paradigms had not been previously investigated and
increased sensitivity is provided on the important
subunit tests.

Since each subject was given only one experi- 
mental treatment, or allocated to only one paradigm, 
the estimates of the mean generalization effect and 
the variances are independent from paradigm to 
paradigm, and t tests can be performed on each
paradigm. This is, of course, supplementary to the 
analysis of variance outlined above, which tests the 
effect of experimental vs. control pairs over all 
paradigms. 

Differences between the paradigms were evalu- 
ated using the Tukey hsd test (see Federer, 1955) 
where one can legitimately make any number of 
nonorthogonal contrasts or even test a posteriori 
hypotheses using confidence limits and adjusting the 
width of these limits according to the number of 
contrasts made. Each paradigm was then tested 
against every other paradigm for significance of 
difference. 


RESULTS AND EVALUATION OF HYPOTHESES 


A split-unit design divides the data into 
two subsets, one involving the whole unit or 
treatment variable and its error variance 
estimate, and a second involving the sub- 
units and subunit X treatment interaction 
with its error variance estimate. The pres- 
entation of the results will then follow this 
format: first, the analysis of the overall sub- 
unit effects and the subunit X treatment 
interaction, followed by t tests for generali- 
zation effects for each paradigm; second, a 
test of the overall whole unit or treatment 
effects, followed by specific contrast between 
paradigms as outlined by Hypotheses 2, 3, 
and 4. The dependent variable is, in all 
cases, the difference between the number of 
correct anticipations of the control pairs 
versus the number of correct anticipations
of the experimental pairs in the five-trial
third stage of each paradigm.


Subunit Effects: Presence or Absence of 
Generalization 


The overall analysis of variance for all 
eight paradigms is presented in Table 2. 
The F ratio for the subunits is 69.70 which 
has a probability value of less than .000001. 
In view of these findings, which are sur- 
prising only in their magnitude, there can be 
little doubt about the occurrence of generali- 
zation with models such as these. However, 
the question still remains as to which of 
the paradigms account for these effects. 
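
The F ratios of the split-unit analysis can be recomputed from the sums of squares and degrees of freedom in Table 2. The sketch below is not the authors' computation; the dictionary layout and names are assumptions, and recomputed values may differ slightly from the printed ones because of rounding in the source.

```python
# (df, SS) for each source of variation, as read from Table 2.
table2 = {
    "treatments": (7, 149.6111),
    "error_a": (136, 3228.5000),
    "subunits": (1, 485.6805),
    "subunits_x_treatments": (7, 80.0417),
    "error_b": (136, 947.2776),
}


def mean_square(source):
    """Mean square = sum of squares / degrees of freedom."""
    df, ss = table2[source]
    return ss / df


# Whole-unit effect is tested against Error (a);
# subunit effect and interaction are tested against Error (b).
f_treatments = mean_square("treatments") / mean_square("error_a")
f_subunits = mean_square("subunits") / mean_square("error_b")
f_interaction = mean_square("subunits_x_treatments") / mean_square("error_b")

total_df = sum(df for df, _ in table2.values())
total_ss = sum(ss for _, ss in table2.values())
```

The point of the split-unit layout is visible in the two error terms: the subunit error mean square is much smaller than the whole-unit error mean square, which is why the subunit tests are the more sensitive ones.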

When we turn to the individual paradigms 
involved in this study, we find the effects 
equally positive. Table 3 presents the means, 
standard deviations, t values, and appropriate 
probabilities for each of the eight paradigms. 
The fact that all paradigms except one sur- 
pass a significance test at the .05 level, 
coupled with the results of the overall F
test for all eight paradigms, leaves little
doubt as to the general tenability of Hypothesis 1.
It should be noted that all paradigms
except III and IV would pass a
significance test at the .01 level. Of course,
some question arises concerning the failure
of Paradigm III to show significant generalization.
But with the similarity of mediation
processes in Paradigms III and IV, as well 
as the highly significant generalization ob- 
tained with the remaining paradigms, it 
seems possible that the lack of more positive 
results with Paradigm III may be attributed 


TABLE 2 


ANALYSIS OF VARIANCE FOR THE EIGHT PARADIGMS 


Source             df    SS          MS          F
Treatments (T)       7    149.6111    21.3730     0.90
Error (a)          136   3228.5000    23.7390
Subunits (S)         1    485.6805   485.6805    69.70**
ST                   7     80.0417    11.4345     1.60
Error (b)          136    947.2776     6.9635

Total              287   4891.1111


**p < .01. 
TABLE 3


MEANS, STANDARD DEVIATIONS, AND t TESTS OF THE GENERALIZATION EFFECTS (DIFFERENCE
BETWEEN EXPERIMENTAL AND CONTROL PAIRS) FOR EACH OF THE EIGHT PARADIGMS


                                   Paradigm
Statistic   Response chaining             Stimulus equivalence   Response equivalence
            I       II      III     IV    V       VI             VII     VIII
M           3.06    2.28    .44     1.83  2.39    4.00           3.61    3.17
SD          3.39    3.06    3.36    3.62  3.93    5.05           3.65    3.45
t           3.83    3.17    .56     2.15  2.58    3.36           4.20    3.91
p           .0008   .003    .202    .025  .009    .002           .0003   .0006


to an individual difference factor occurring 
by chance in the subject sampling process. 
This possibility is partly supported when the 
experimental and control pairs are con- 
sidered separately. Table 4 presents the 
relevant data. 
Note that the total for control pairs is larger 
for Paradigm III than for the remaining paradigms
while the total for experimental pairs
is virtually identical with the total for Para- 
digms II and IV. Of course such data are 
by no means conclusive, but the possible 
importance of individual differences is at 
least suggested here. Some additional con- 
siderations on this point will be presented 
in the discussion section. 
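
The relation between Tables 3 and 4 can be checked with a short computation. The sketch below assumes, as the text implies, that each mean generalization effect is the difference between experimental and control totals divided by the 18 subjects in a paradigm, and that each t is a one-sample t on those difference scores; variable names are illustrative only.

```python
import math

N_PER_PARADIGM = 18
paradigms = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII"]

# Totals of correct responses from Table 4.
experimental = [188, 149, 151, 147, 146, 180, 192, 178]
control = [133, 108, 143, 114, 103, 108, 127, 121]

# Standard deviations of the difference scores from Table 3.
sd = [3.39, 3.06, 3.36, 3.62, 3.93, 5.05, 3.65, 3.45]

results = {}
for name, e, c, s in zip(paradigms, experimental, control, sd):
    mean_diff = (e - c) / N_PER_PARADIGM                 # mean generalization effect
    t_value = mean_diff / (s / math.sqrt(N_PER_PARADIGM))  # one-sample t, df = 17
    results[name] = (round(mean_diff, 2), round(t_value, 2))
```

On this reading the mean for Paradigm III works out to 8/18, or about .44, and the recomputed t values should land close to the tabled ones, with small discrepancies expected from rounding of the standard deviations.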

The F ratio for the subunit x treatment 
interaction fell short of the value necessary 
to reject the null hypothesis at the .05 level. 


This adds support to the whole unit analysis to
be discussed shortly, which shows no evidence
for treatment differences.


Whole Unit Variation: Differences between
Paradigms in Generalization Effects


Table 2 indicates that the F ratio for the 
whole unit variable is .90, which is not
significant. Ordinarily one would stop at 
this point and perform no further tests, in- 
asmuch as such tests would not be statisti- 
cally meaningful. However, in exploratory 
research, one frequently makes such con- 
trasts anyway, to gain a more sensitive view 
in the point estimates. Such tests were made 
utilizing Tukey’s hsd (cf. Federer, 1955), 
and even the largest difference, the differ- 
ence between Paradigms III and VI was 


TABLE 4 


TOTAL CORRECT RESPONSES ON EXPERIMENTAL (E) AND CONTROL (C) PAIRS FOR
EACH OF THE EIGHT PARADIGMS


                                 Paradigm
Pair     Response chaining         Stimulus equivalence   Response equivalence
         I     II    III   IV      V     VI               VII   VIII
E        188   149   151   147     146   180              192   178
C        133   108   143   114     103   108              127   121
Total    321   257   294   261     249   288              319   299


not great enough to reject the null hypothesis
at the .05 level. Thus the null hypothesis
counterparts of Hypotheses 2, 3, and
4 cannot be rejected.

The reader will recall that in the discus- 
sion of split-unit analyses it was pointed 
out that the gain in sensitivity for the sub- 
units and subunit X treatment interaction is 
at a sacrifice in sensitivity in evaluating 
differences between treatments (or para- 
digms); therefore, it remains worthwhile 
to examine the nonconfirmed hypotheses in 
terms of the point estimates (means) of the 
generalization effect. Table 3 includes the 
means and standard deviations of each of 
the eight paradigms. 

Hypothesis 2 stated that Paradigms I, II, 
VI, and VII would show greater generaliza- 
tion than Paradigms III, IV, V, and VIII. 
The difference between the mean effects of 
these groups is 1.28, a difference of 
moderately large magnitude and in the right 
direction. Thus Hypothesis 2 is confirmed 
with respect to the directionality of the 
differences.

Hypothesis 3 stated that Paradigms I, IV, 
VII, and VIII would show larger amounts 
of generalization than Paradigms II, III, V,
and VI. The difference between the means 
of these sets of paradigms is .69, and again 
the directionality is confirmed though the 
magnitude of the difference is small and un- 
impressive. 

Hypothesis 4 was not stated in a statisti- 
cally testable form unless one is willing to 
hypothesize that all sets of paradigms will 
be significantly different from all other sets 
of paradigms. This hypothesis predicts that 
in the order of the magnitude of the general- 
ization effect the paradigms will align them- 
selves in groups with Paradigms I and VII 
showing the greatest generalization effects, 
followed by Paradigms II and VI; IV and 
VIII were predicted to be the third strongest 
set and Paradigms III and V were expected
to be the weakest. From Table 3 it can be 
seen that all of these predictions held. The 
mean generalization effects for the four 
groups are as follows: 


I and VII    II and VI    IV and VIII    III and V
  3.34         3.14          2.50          1.42


Thus Hypothesis 4 is also confirmed with 
respect to direction. Again, no differences 
are significant. 
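
The directional comparisons for Hypotheses 2 and 4 can be reproduced from the paradigm means in Table 3. This is a sketch rather than the authors' procedure; the mean of .44 for Paradigm III is an assumption derived from the Table 4 totals (8/18), and the helper name is illustrative.

```python
# Mean generalization effects per paradigm (Table 3; III derived from Table 4).
means = {
    "I": 3.06, "II": 2.28, "III": 0.44, "IV": 1.83,
    "V": 2.39, "VI": 4.00, "VII": 3.61, "VIII": 3.17,
}


def group_mean(names):
    """Unweighted mean of the listed paradigm means (equal n per paradigm)."""
    return sum(means[n] for n in names) / len(names)


# Hypothesis 2: I, II, VI, VII predicted to exceed III, IV, V, VIII.
h2_diff = group_mean(["I", "II", "VI", "VII"]) - group_mean(["III", "IV", "V", "VIII"])

# Hypothesis 4: four pairs of paradigms, listed from strongest to weakest prediction.
h4_groups = [("I", "VII"), ("II", "VI"), ("IV", "VIII"), ("III", "V")]
h4_means = [group_mean(g) for g in h4_groups]
```

Because every paradigm has the same number of subjects, unweighted means of the paradigm means are equivalent to pooled means, which is why a simple average suffices here.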

In summary then, it may be said that 
Hypothesis 1 was strongly confirmed by 
the overall analysis of variance and by 
individual t tests of the paradigms; Hypoth- 
eses 2, 3, and 4 lacked statistical confirma- 
tion, but these hypotheses were correct in 
predicting directionality of observed differ- 
ences and in predicting the relationship be- 
tween the paradigm effects. 


Discussion 


The model presented earlier is based on 
the assumption that mediated generalization 
occurs in all paired-associate learning situa- 
tions in which some common element has 
been associated with two otherwise independ- 
ent elements. The present study was designed 
to test this assumption and to investigate the 
nature of the mediational process in a par- 
ticular subset of verbal learning tasks. 

In addition to the overall effects, it was 
postulated that two factors in the mediation 
process would determine the amount of gen- 
eralization to be expected in the eight para- 
digms. These two factors, directionality of 
associative connections and contiguity be- 
tween explicit and implicit elements, were 
used to make predictions about the relative 
magnitude of generalization effects in these 
paradigms. These factors are illustrated in 
the paradigms presented in Figure 4. 


Fig. 4. Mediate associations in the eight paradigms.

The results presented in the preceding
section leave little doubt as to the plausibility
of the general assumption, at least
with paired-associate learning paradigms.
The magnitude of the F ratio of the subunit
effects, coupled with the individual t
tests, seven of which have probability values
less than .05, can be interpreted as a strong
confirmation of Hypothesis 1. A possible
exception exists, due to the failure of one of
the chaining paradigms to show significant
generalization effects.

The postulated bidirectionality of associational
links also received considerable support
from this experiment. It can be seen
that six of the seven significant paradigms
utilize at least one reverse association in
either the second stage learning or the test
stage. Although most paradigms rely on
some combination of the forward and reverse
association to account for the generalization
effect, three of the significant paradigms
have a reverse connection in both
mediational stages (Paradigms IV, V, and
VIII). The importance of forward associations
is clearly demonstrated in the magnitude
of the generalization effects for Paradigms
I, II, VI, and VII, whereas support
for the factor of reverse association is provided
by the presence of generalization
effects in Paradigms IV, V, and VIII.
Presumably, if it were not for these reverse
connections, Paradigms IV, V, and VIII
would show no generalization effect at all.
These results strongly suggest that forward
associations alone simply do not provide an
adequate basis for explaining all mediational
effects. It should be noted that this finding
has obvious implications for advocates of
representational models of mediation, since
none of these models provide for bidirectionality.

Even though the predicted differences between
paradigms with forward and reverse
connections were not significant, the data
concerning the bidirectional factor also lend
support to the assumption about the relative
strength of forward and reverse associations.
Although this investigation was not
designed to determine the empirical relation 
of these connections, the data suggest that 
forward links are stronger. This can be 


illustrated by comparing Paradigms V with 
VI and VII with VIII. The only difference 
between V and VI is in the directionality 
of the implicit association formed during the 
second stage learning trials. From this, we 
predict that VI will show more generalization
than V. It can be argued that Paradigm VII
should produce more generalization
than Paradigm VIII on a similar basis.
Both of these predictions are confirmed. In
addition, the relatively small mean generalization
effects of Paradigms III and IV, which
have reverse mediational connections, lend
additional support to this assumption.* The
fact that a consideration of the directionality 
factor alone does not lead to a clear-cut 
separation of the paradigms may be partially 
accounted for by the multiplicity of factors 
which evolved, both those which were postulated
and others which were suggested by
the data.

The second factor proposed, contiguity in 
the second stage mediation process, is more 
difficult to evaluate. As mentioned in the 
results section, the difference between sets of 
paradigms with postulated contiguous media- 
tors as opposed to those with noncontiguous 
mediators was small, but in the predicted 
direction. Unlike the bidirectionality factor, 
presence or absence of generalization in any
paradigm is not a crucial test of this
mechanism. In view of this, perhaps the
best supplementary evidence can be gained
by contrasting those paradigms which differ
only in the contiguity of the second stage
mediator. Such differences are found only
among the chaining paradigms. By contrasting
Paradigm I with II and III with
IV, it can be shown that differences in generalization
effects are observed and these
differences are larger than the pooled effect
of all contiguous vs. noncontiguous mediators
(Hypothesis 3). Although these interparadigm
effects are not significant, the
consistency of these results lends considerable
support to the contiguity argument.
Of course, alternative explanations, e.g.,


*It should be pointed out that the paradigm 
which had all forward associational links, Para- 
digm II, did not show the greatest amount of 
generalization.


the magnitude of generalization depends
only on the directionality of the associa- 
tive links, or contiguous and remote con- 
nections are equally strong, remain plausible. 
On the basis of the experimental design 
used here, it seems premature to make any 
final judgment as to the importance of the 
contiguity factor, although its operation re- 
ceived some support. 

The final assumption made in the pro- 
posed mediational model was that direction 
of association would be a more important 
factor than contiguity. The results given 
earlier also support this assumption. 
Although no differences were significant, 
the mean generalization effects, for pairs of 
paradigms, were in the exact order pre- 
dicted in Hypothesis 4. Further evidence 
on this assumption may be gained by con- 
sidering separately the chaining paradigms 
and the equivalence models. Hypotheses 
analogous to Hypothesis 4 can be generated 
for each set of paradigms, and within each 
set the paradigms can be ordered on the 
basis of the two principles under investiga- 
tion. The data in Table 3 indicate that this 
way of viewing the results leads to complete 
predictability in the rank order of the chain- 
ing paradigms and only one deviation in the 
predicted order of the equivalence para- 
digms. This interpretation involves no 
change in the factors previously mentioned, 
either as to operation or importance, and 
the one deviation from predicted order in- 
volves only the less well-established con- 
tiguity principle. Even this one deviation 
may be due to differences in heterogeneity 
of subjects (see Table 3) rather than a 
failure of the contiguity principle. Of 
course, these are somewhat speculative deductions
and must await verification using
an experimental design which maximizes 
sensitivity on interparadigm effects. 

In conclusion then, the data obtained in 
this experiment strongly confirm the bi- 
directional interpretation of mediational 
processes. In addition, the importance of 
implicit connections formed during the second
learning stage is strongly supported
on the basis of confirmed predictions based
on either bidirectionality or contiguity of explicit
and implicit terms.


5 These paradigms were separately treated in
two previous papers by Horton (1959) and
Kjeldergaard (1960).


Facilitation vs. Interference 


The analysis presented here assumes that 
the generalization effects obtained are truly 
the result of facilitation in the learning of 
experimental pairs, and not the result of in- 
terference in the learning of control pairs. 
Either or both interpretations are possible.
The difference can be illustrated by examining
the test stage for mediational and nonmediational
pairs in Paradigm I as illustrated
in Figure 5.

Whether one argues for facilitation or interference,
the mediational pairs will be
learned faster. However, the interference
argument is based on the assumption that
the A → D connections will slow down or
interfere with the learning of A → C pairs,
whereas the facilitation argument assumes
that the A → C pairs will be learned faster
due to the implicit chain of previously
formed associations. No resolution of this
difference is possible here, but an analysis
of the results from the second stage, which
provides a comparable situation, is informative.
The comparison analogous to that presented
above can be best illustrated using
the classical interference model (Paradigm VII).

Stage 1    B/D → A
Stage 2    B → C


Keeping in mind that D represents control
words, it can be seen that half of the pairs
involve stimulus words in the second stage
which are entirely new terms (i.e., B words
not present in Stage 1). The remaining
pairs contain B words previously associated


Fig. 5. Experimental versus control pairs in the
test stage of Paradigm I. [Panels: Mediation Pairs (B); Non-Mediation Pairs (D).]

TABLE 5


DIFFERENCES IN THE NUMBER OF CORRECT ANTICIPATIONS ON FAMILIAR PAIRS MINUS
NEUTRAL PAIRS (F - N) DURING STAGE 2 LEARNING


with A words. Facilitation in the latter case
would be explained on the basis of familiarity
with these B words and interference
would be accounted for by assuming that
previously learned B → A associations
make it difficult to learn the new B → C
connections. In any case, if the connections
between new B terms (i.e., those not appearing
in Stage 1) and C words are formed more
rapidly than the connections between familiar
B terms and associated C words, the
interference position would be strengthened.
The data relevant to this comparison are 
given in Table 5. 

The results do not favor either interpreta- 
tion as the interference or facilitation effects 
vary widely from paradigm to paradigm and 
the mean interference effect (F - N) over all
paradigms is only -.18. Since the interference
effect is so small in this case, it
seems unlikely that interference could ac- 
count for much of the test stage generaliza- 
tion effect. The general lack of support for 
familiarity has no real implication for the 
test stage. Thus, it appears that the gener- 
alization effect observed in this experiment 
is in all probability the result of facilitation 
in the learning of experimental pairs as the 
model suggests. 


The Nonsignificant Paradigm 


One of the questions arising from this in- 
vestigation concerns the failure of Paradigm 
III to show significant generalization. If 
this lack of significance is viewed as indicating
no mediational effect, then serious
question arises concerning the existence and
importance of the reverse associative factor
proposed here. However, since the mediational
factors in Paradigm III are so similar
to those of Paradigms IV and V, which
showed significant generalization effects,
other interpretations seem more probable.

Perhaps the most parsimonious explana- 
tion of the results obtained with Paradigm 
III is that the degree of generalization is 
low. If this is the case, a larger number of 
word pairs or a greater number of trials in 
the learning stage will be needed in order to 
demonstrate significance. It should be re- 
membered that Paradigm III was expected 
to be weaker than most others and that the 
paradigm next lowest in observed generalization
effect, IV, was identical to Paradigm
III with respect to the important direction- 
ality factor. 

An alternative way of viewing the results 
is that a mediating effect exists with Para- 
digm III but that it is lost with the particu- 
lar subjects used in this experiment. If this 
is the case, the data should reveal some 
peculiarities for these subjects. In an earlier 
section it was noted that the subjects in 
Paradigm III showed a high frequency of 
correct anticipations on the control pairs 
(more than any other paradigm) while their 
frequency on the experimental pairs was approximately
equal to that of the subjects in
Paradigms II, IV, and V. These data suggest
that the subjects of Paradigm III were
"fast learners," a point which is further supported
when the data of Table 6 are examined.
It can be noted that the Paradigm
III subjects are the fastest on the important
second stage list, and second fastest on total 
trials. Now if this speed of learning in- 
terpretation is valid for these subjects, on 
the paradigm which is theoretically one of 
the two weakest, it is possible that the fre- 
quency of exposure to the pairs was not suf- 
ficient to bring the associative connections 
above threshold. The other two paradigms 
TABLE 6


TRIALS TO CRITERION IN THE LEARNING STAGES OF EACH PARADIGM


Stage    I     II    III   IV    V     VI    VII   VIII
1        288   360   287   274   310   267   337   229
2        239   235   182   197   215   188   200   203
Total    527   585   469   471   525   455   537   502


(IV and VI) which reflect comparable
learning speed show either a more favor- 
able contiguity or directionality factor which 
may have required fewer exposures in the 
learning stages to show generalization effects 
in the test stage. 

Obviously post hoc analyses such as these
must be viewed somewhat skeptically. The
next step would be to perform another experiment
using more word pairs or requiring
a greater exposure frequency. Such an
experiment would provide a satisfactory
answer to the questions raised here.


Subjects Who Fail to Generalize


The present investigation leaves little
doubt concerning the existence of generalization
effects with seven of the eight paradigms.
However, before any definite conclusions
can be drawn about the process
involved, the failure of several subjects to
generalize must be examined. Table 7 indicates
the number of subjects who generalize
(i.e., learn more experimental than control
pairs) and do not generalize (i.e., experimental
pairs are equal to or less than control
pairs) for each paradigm.

Using this somewhat liberal definition of
generalization and considering only those
paradigms showing significant generalization
(i.e., all except III), it can be seen that
25% of the subjects do not generalize. This
figure represents too great a portion of the
sample to attribute the results to experimental
error, though many investigators
with similar results seem to ignore this
phenomenon. Such a resolution of the problem
is untestable and should not be used in
the absence of more serious attempts to investigate
alternative explanations.

Since earlier investigations (e.g., Peters,
1935) had encountered the problem mentioned
above, an attempt was made during
the experiment to look for potential explanatory
factors. One way of doing this,
however crude, was to question the subjects
about their method, and attitude toward the
experiment, after they had served as subjects.
This was done whenever time permitted.
It was felt that data obtained in
this manner might reveal such things as the
importance of difference in the speed with
which subjects "catch on" to the experiment.
The data obtained indicate that none of
the subjects questioned realized the exact
nature of the task. This appeared to be the
case whether they were questioned on this
point before or after the experiment was
explained. As Bugelski and Scharlock


TABLE 7 


INDIVIDUAL SUBJECTS SHOWING GENERALIZATION IN EACH PARADIGM 


Subject            I     II    III   IV    V     VI    VII   VIII
Generalizers       15    14    9     11    14    14    14    13
Nongeneralizers    3     4     9     7     4     4     4     5


(1952) discovered, there appeared to be no
realization of the presence of a common 
term as a mnemonic device during the learn- 
ing stages. Furthermore, the subjects were 
so completely unaware of this factor that 
they accepted erroneous explanations of the 
experiment as readily as the true one. Thus, 
it seems clear that the experimental findings 
presented here are not dependent on “aware- 
ness,” “insight,” or conscious use of asso- 
ciative mediators. Yet within this apparent- 
ly “unaware” group, differences are found 
which need to be accounted for. 
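
The 25% figure cited for nongeneralizers can be checked against Table 7. The sketch below takes the generalizer counts from the table and derives the nongeneralizers as the remainder of the 18 subjects in each paradigm; that derivation, and the variable names, are assumptions made for illustration.

```python
N_PER_PARADIGM = 18

# Number of subjects learning more experimental than control pairs (Table 7).
generalizers = {
    "I": 15, "II": 14, "III": 9, "IV": 11,
    "V": 14, "VI": 14, "VII": 14, "VIII": 13,
}

# Everyone else in a paradigm counts as a nongeneralizer.
nongeneralizers = {p: N_PER_PARADIGM - g for p, g in generalizers.items()}

# Restrict to paradigms with significant generalization (all except III).
significant = [p for p in generalizers if p != "III"]
n_non = sum(nongeneralizers[p] for p in significant)
n_total = N_PER_PARADIGM * len(significant)
share_non = n_non / n_total
```

Under these assumptions the share comes to 31 of 126 subjects, which rounds to the 25% reported in the text.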

Within the context of the associative fac- 
tors suggested earlier, an obvious variable to 
examine in accounting for subjects failing 
to generalize is that of individual threshold 
level. Some subjects (due to individual 
difference factors of a sort) may not have 
received sufficient exposure on the pairs to 
raise the habit connections, necessary for 
mediation to take place, above threshold. 
Although this explanation may be acceptable 
for Paradigm III, and is roughly supported 
by a comparison of the data presented in 
Tables 6 and 7, the account is not especially 
impressive for the remaining paradigms. 
Of course, this is not to say that the ex- 
posure argument is totally irrelevant, be- 
cause exposure frequency may be related to 
other factors.

Examination of the data obtained in this
experiment and questioning of the subjects
suggest several factors which appear to be
related to the degree of generalization. Perhaps
the most obvious of these variables is
the speed with which individuals learn.
When the amount of generalization is examined
as a function of the number of trials
required to learn Stage 1 (see Table 8) it
can be seen that the mean level of generalization
increases sharply from the first to the
second block of trials, and after a drop off,
remains at a level considerably above that
for the first block. This increase in generalization
supports the threshold argument presented
above. It should be noted that this
comparison is not rigorous since there is
no effective counterbalancing with respect to
lists, paradigms, or subject learning speed.

Examination of the raw data shows that 
the mean level of generalization among the 


10 subjects taking six to eight trials on 
Stage 1 is only .60. Just why this occurs 
is not clear, although it may be that in- 
dividuals who are very fast learners may 
operate under some combination of set or 
ability factors which are different from those 
of the other subjects. Such factors may lead 
to extremely rapid learning but at the same 
time decrease the likelihood of generaliza- 
tion. 
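
The pattern described for Table 8 can be summarized numerically. This sketch treats the garbled entry for the 16-18 block as 2.99, which is an assumption about the source, and the names used are illustrative.

```python
# (trials to learn Stage 1, mean generalization, number of subjects) from Table 8.
blocks = [
    ("6-10", 1.48, 23),
    ("11-13", 3.38, 24),
    ("14-15", 2.46, 24),
    ("16-18", 2.99, 30),  # assumed reading of a garbled entry
    ("19-24", 2.38, 21),
    ("25-35", 2.55, 22),
]

total_n = sum(n for _, _, n in blocks)

# Mean over all subjects, weighting each block by its size.
weighted_mean = sum(m * n for _, m, n in blocks) / total_n

# Block with the highest mean generalization.
peak_label, peak_mean, _ = max(blocks, key=lambda b: b[1])

# Every slower block compared with the fastest learners.
first_mean = blocks[0][1]
later_exceed_first = all(m > first_mean for _, m, _ in blocks[1:])
```

The block sizes sum to the 144 subjects of the experiment, the peak falls in the second block, and every later block stays above the fastest learners, which is the pattern the text describes.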

That the learning factor mentioned above 
may be a complicated one is supported by 
the questioning of subjects. Such data sug- 
gest two factors which in turn may interact 
with whatever ability factors are involved 
in paired-associate learning. The first of 
these is concerned with the subject’s con- 
ception of what he is supposed to do; this 
might be referred to as task set. The second 
is related to the subject’s method or strategy
(i.e., how he performs the task). For example,
it is possible that the usual instructions
establish a set, or determine a method,
that increases the number of exposures 
necessary to raise habit strength above thresh- 
old. The exact nature of such sets, and the 
degree to which they can be modified by in- 
structions, must be determined experimen- 
tally. However, consider the possibility that 
there is some set, or method, or combination 
of the two, for which the frequency of ex- 
posure necessary to raise connections above 
threshold is at a minimum. Now, if either 
the set or method is not of this optimum 
variety it may be that the exposure frequency


TABLE 8


AMOUNT OF GENERALIZATION AS A FUNCTION
OF LEARNING SPEED IN STAGE 1


            Generalization
Trial       M       N
6-10        1.48    23
11-13       3.38    24
14-15       2.46    24
16-18       2.99    30
19-24       2.38    21
25-35       2.55    22


must be greater to form connections
of equal strength. These factors might be 
investigated by varying the type of instruc- 
tions given to the subjects, since the instruc- 
tions presumably play an important part in 
determining task set and strategy. 

In view of the above considerations sev- 
eral parameters can be suggested which 
should be important in the type of generali- 
zation situation investigated here. First, each 
paradigm presumably requires a certain fre- 
quency of exposure during the learning 
stages. This frequency is a function of the 
variety of associative connections required 
for mediation. Second, each individual has 
an exposure parameter with respect to each 
paradigm, although it is likely that the rela- 
tive rank for each person, in terms of num- 
ber of trials, is approximately the same 
across all paradigms. Third, some sort of 
optimum set exists for the production of 
mediated generalization. Fourth, and last, 
some sort of optimum strategy exists for 
producing generalization. Of course, it 
should be made clear that individual differ- 
ences in the ability to generalize will not 
disappear even under ideal conditions. Such 
an ability, which may involve the breadth or 
narrowness of previous learning, the ability 
to put two and two together, or the tend- 
ency toward concrete or abstract thinking, 
presumably determines the upper limit for 
generalization in any such situation. 


Conclusions 


The data obtained in this investigation 
tend to support the notion that generalization 
can be obtained with all mediational models, 
although Paradigm III may require addi- 
tional learning trials in order to raise the 
strength of reverse connections above thresh- 
old, or more mediational pairs to demon- 
strate the effect. In addition, the associative 
factors suggested here, at least those con- 
cerning the strength of forward and reverse 
associations and their corresponding thresh- 
olds, tend to be supported. Further re- 
search, however, will be needed to establish 
more clearly the influence of the factor of 
contiguity in association. Finally, there is a 
suggestion that some kind of task set or 
strategy may be important in explaining the
lack of generalization obtained with some 
of the subjects in paradigms showing 
highly significant generalization. 


SUMMARY 


This investigation was designed to shed 
light upon the role of mediate association in 
verbal generalization processes. Although 
the systematic formulation of this concept 
goes back to Hull's early papers, the research 
dealing with it has been quite limited in 
scope and relatively unsystematic. The 
previous work on such models has been 
done largely with verbal materials in the 
context of paired-associate learning. These 
studies have suggested several factors, such 
as reverse association and contiguity of 
mediating links, which may be important in 
the generalization process. In addition, 
these factors lead to the consideration of 
mediate association models where the media- 
tion takes place during the learning stages 
as well as the traditional final stage. The 
present experiment constituted an attempt 
to discover which of eight paradigms would 
show positive facilitation effects and which 
factors might account for such facilitation 
as well as differences between paradigms 
in the degree of facilitation of generaliza- 
tion. 

The experiment involved a three-stage 
paired-associate learning task in which stim- 
uli or responses were equated, or in which 
response chains, e.g., A——>B, B——>C,
were formed, in the first two stages, fol- 
lowed by a test stage in which the equated 
stimuli or responses, or the first and last 
members of response chains, were paired 
with each other and the amount of generali-
zation or facilitation noted. Combining word
pairs into a three-stage paired-associate 
learning task yields eight paradigms in all: 
four response chaining, two stimulus 
equivalence, two response equivalence. 
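
The count of eight paradigms can be reproduced by a short enumeration. The sketch below is a reconstruction, not the authors' notation: it assumes the test pair is always A to C, that B is the mediating term, that each learning pair may run in either direction, and that the two learning stages may occur in either order. The function names and labels are hypothetical.

```python
from itertools import product


def classify(ab_pair, bc_pair):
    """Classify a paradigm from its two learning pairs, each (stimulus, response)."""
    b_as_response = [pair[1] == "B" for pair in (ab_pair, bc_pair)]
    if all(b_as_response):
        return "stimulus equivalence"   # A->B and C->B share the response B
    if not any(b_as_response):
        return "response equivalence"   # B->A and B->C share the stimulus B
    return "response chaining"          # B is a response once and a stimulus once


def enumerate_paradigms():
    """List every ordering and direction of the two learning pairs (assumed scheme)."""
    ab_options = [("A", "B"), ("B", "A")]   # pair linking A and B, either direction
    bc_options = [("B", "C"), ("C", "B")]   # pair linking B and C, either direction
    paradigms = []
    for ab_pair, bc_pair, ab_first in product(ab_options, bc_options, (True, False)):
        stages = (ab_pair, bc_pair) if ab_first else (bc_pair, ab_pair)
        paradigms.append((stages, classify(ab_pair, bc_pair)))
    return paradigms


def count_by_type(paradigms):
    """Tally paradigms by their classification label."""
    counts = {}
    for _, kind in paradigms:
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```

Under these assumptions the enumeration gives four chaining, two stimulus equivalence, and two response equivalence arrangements, which agrees with the breakdown stated above.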

An analysis of the possible mediating 
links in these paradigms leads to the con- 
clusion that for most of the paradigms to 
show any generalization effects, backward
or bidirectional association seems to be a
necessary explanatory construct. That is,
one has to postulate that the learning of an
A——>B pair simultaneously results in the
formation of a B——>A habit, the strength
of which is some function of the A——>B
pairings. Several other investigators have 
provided data which support such an as- 
sumption (Jantz & Underwood, 1958; 
Murdock, 1956, 1958; Russell, 1955). Most 
of the available theories of mediated general- 
ization, however, have completely ignored 
the problem of backward or bidirectional 
mediators (cf. Cofer & Foley, 1942; 
Mowrer, 1954; Osgood, 1952); Jenkins 
(1955) and Russell (1955) have dealt with 
it informally. 

A second factor, contiguity of the media- 
tor in the second stage learning, was postu- 
lated to account for anticipated differences 
between certain subsets of the paradigms, 
namely, two chaining paradigms and the 
stimulus equivalence models versus two
other chaining paradigms and the response 
equivalence models. In the stimulus equiva- 
lence models and two of the chaining para- 
digms, the second stage implicit response, 
( ), would necessarily follow the explicit 
response term; whereas in the response 
equivalence models and the other two chain- 
ing paradigms, the mediator would presum- 
ably follow the stimulus and antedate the 
response. Thus the second stage processes 
in the stimulus equivalence and two of the 
four chaining paradigms can be conceptu-
alized as learning the chain C——>B——>
(A) or A——>B——>(C), as opposed to
the cases of the response equivalence models
and the other two chaining paradigms where
the chains would be B——>(A)——>C or
B——>(C)——>A. Since the test stage for
all paradigms involved learning an A——>C 
pair, it was assumed that the contiguous me- 
diational links formed in the second stage 
pairing in the response equivalence models
and two chaining paradigms would show more
generalization strength than the more remote
pairings in the stimulus equivalence para-
digms and in the other two chaining para-
digms.

The materials used in the experiment 
were low frequency five-letter real words
selected from the Thorndike-Lorge lists
(1944). A large number of words selected 


on a frequency criterion were subjected to 
tests for association value. The words 
finally selected were randomly assigned to 
four lists and these four lists were per- 
muted to form 12 lists of eight word pairs. 
These 12 lists permitted complete counter- 
balancing of list difficulty in the test stage 
and partial counterbalancing of list difficulty 
in the two learning stages. 
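
One way four word lists can yield 12 lists of word pairs is to take every ordered pairing of two distinct lists, since 4 x 3 = 12. The sketch below illustrates that reading only; the exact permutation scheme is not spelled out here, the list names W, X, Y, Z and the placeholder tokens are hypothetical, and the assumption of eight words per list is inferred from the eight pairs per list.

```python
from itertools import permutations


def build_pair_lists(word_lists):
    """Pair every ordered combination of two distinct word lists, item by item."""
    pair_lists = {}
    for (name_s, stimuli), (name_r, responses) in permutations(word_lists.items(), 2):
        # The first list supplies the stimulus terms, the second the response terms.
        pair_lists[(name_s, name_r)] = list(zip(stimuli, responses))
    return pair_lists


# Placeholder tokens stand in for the actual words, which are not listed here.
word_lists = {name: [f"{name}{i}" for i in range(1, 9)] for name in "WXYZ"}
pair_lists = build_pair_lists(word_lists)
```

Each of the 12 resulting lists contains eight pairs, so any list can serve in the test stage for some subjects and in a learning stage for others.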

Earlier experimentation had suggested 
that individual differences in paired-asso- 
ciate learning are so great that they might 
easily mask the generalization effect unless 
these differences were controlled. Therefore, 
a split-unit design was utilized with each
subject (all were female) acting as her
own control. The test stage of each para-
digm was made up of eight word pairs,
four of which were generalization pairs and 
four of which were control pairs. The de- 
pendent variable, then, was the difference 
between the generalization pairs and the 
control pairs. 

The results of this investigation clearly 
establish the presence of generalization 
effects for all paradigms except one of the 
chaining models. The F ratio for subunits 
(a test of the null hypothesis that there were 
no mediational effects in any of the para- 
digms) was 69.70. An F ratio of this mag- 
nitude could be expected to occur by chance 
about one time per million. Of the eight t
tests performed on the individual paradigm 
effects, seven were significant at the .05 
level or beyond. No paradigm was signifi- 
cantly different from any other, though the 
predictions about the relative magnitudes of 
the generalization effects for the eight para- 
digms made on the basis of the two postu- 
lated factors, bidirectionality and contiguity 
of mediators, were all confirmed with re- 
spect to the direction of the observed differ- 
ences. The results for the nonsignificant 
paradigm were positive, though small.

A split-unit analysis sacrifices between- 
paradigm sensitivity in favor of the sen- 
sitivity in detecting within-paradigm effects; 
therefore, final conclusions about paradigm 
differences must be held in abeyance. 


These results confirm the importance of 
both forward and reverse mediate associa- 
tions in the verbal transfer process. Fur-
thermore, there is sufficient evidence for 
both contiguous and noncontiguous mediate 
associations and implicit rehearsal of previ- 
ously learned associations to warrant further 
research on these factors. In an attempt to 
discover reasons for some subjects' failing 
to generalize with paradigms which show 
in general highly significant generalization 
effects, whenever possible the subjects were 
questioned about their attitudes and strate- 
gies. The answers given suggest that task 
set (e.g., what is supposed to be done?) and 
strategy (e.g., how is it supposed to be
done?) may be important variables in ob- 
taining generalization. In any case, further
investigation of these variables seems neces- 
sary. 

In conclusion, this study has clearly estab-
lished the presence of generalization effects
with three chaining, two acquired stimulus 
equivalence, and two acquired response
equivalence paradigms, and has strongly 
suggested the presence of several associa- 
tive factors and indicated a need for further 
investigation of others. In addition, in- 
dividual difference factors, such as task set 
and strategy, appear as important param- 
eters in generalization and warrant further 
investigation. 


A group's effectiveness is at one time or
another a function of organizational,
technical, motivational, personal, and inter-
personal variables. However, the attempt to
identify specific variables and to demonstrate 
through empirical studies their consistent re- 
lationship with performance has often led 
to frustration. Explanations for the failure 
include the use of contrived as compared 
with normal group tasks, subjects (Ss) 
whose knowledge of one another is largely 
a consequence of the research itself, inap-
propriate formulation and administration of
independent or predictor variables, unreliable 
measures of group performance, and the use 
of a limited number of groups. It was hoped 
that the awareness of these pitfalls would 
benefit the present effort. 

This research focused on the contribution 
to group effectiveness of one aspect of the 
leader-men relationship. The general hypoth- 
esis is that the more a leader satisfies the 
needs of his men, the more effective group 
performance. The hypothesis derives from
a theory of interpersonal relationships to the 
effect that when Individual A satisfies a 
need of Individual B, then Individual B 
tends to satisfy a need of Individual A. 


* Studies A and B were done under the egis of
the now defunct Institute for Research in Human
Relations. Psychological Research Associates,
Incorporated conducted Studies C and D. Appreci-
ation is expressed to the latter organization for
providing some of the original data to use in pre-
paring this article. Studies A, B, and C were
sponsored by the Personnel Research Branch of
The Adjutant General's Office, Department of the
Army. The Office of Naval Research funded fur-
ther analyses of data collected in Study B. Study
D was supported by the former Air Force Per-
sonnel and Training Research Center. Portions
of Studies A and B formed a part of a disserta-
tion submitted in partial fulfillment of the require-
ments for the degree of Doctor of Philosophy in
A need and its satisfaction may be material
or immaterial and may involve attitudes and be-
haviors. A and B may be conscious or un-
conscious of the situation. Nevertheless, for
the theory to be relevant there must be a past 
relationship or experience of satisfaction 
for B with A. When B, then, satisfies in 
return a need for A, there is the “recipro- 
cality of indulgences.” Of course, although 
A may satisfy a need for B, the latter may 
not always be able to reciprocate a need 
satisfaction for A.

Individual A can represent a leader and
Individual B any member of his group. As- 
sume a leader satisfies certain needs of his 
individual group members. Group members 
can often reciprocate a leader's indulgences 
to them through effective performance, since 
a group's behavior generally reflects directly 
on a leader. 

Most research in the past on effective 
leadership, and consequently on effective
group performance, has centered on the 
nature of the leader. How competent is he 
technically? What are his personality char- 
acteristics? Insufficient attention has been 
devoted to group members and their rela- 
tionship to the leader or his to them. The 


the Department of Social Relations at Harvard 
University. The points of view or findings con- 
veyed explicitly or implicitly are not to be con- 
strued as necessarily reflecting the position of any 
element or component of the Department of De-
fense. Moreover, this article is not an official pub-
lication under any contract. Many people have
contributed in various ways to this series of re- 
search studies. Particular appreciation is deserved 
by all the authors of government contract reports 
listed in the references. Special thanks are ex- 
tended to Gardner Lindzey, dissertation advisor. 
Last but not least, I feel a deep sense of indebted- 
ness to hundreds of military personnel without 
whose many hours of grueling participation in the 
projects this article could not have been written. 
notion has not been adequately explored
with empirical tests that a leader's effective- 
ness might be a function of the characteris- 
tics of the group members and the resultant 
relationship with the particular leader. 

Some studies have been concerned with 
the effects on group performance of the type 
of leadership or group atmosphere. White 
and Lippitt (1953), studying adult leader- 
ship patterns for boys' clubs, concluded that 
although not as much work was produced 
in the democratic boys' club as the autocratic 
club there was in the former greater work 
motivation and greater originality. The 
laissez-faire boys’ club was inferior in 
quality and quantity of work to the other 
clubs. The Katz, Maccoby, Gurin, and Floor
(1951) study, where productivity of railroad 
workers was positively related to the super- 
visor's differentiation of his role with others, 
and the Berkowitz study (1951), where con- 
ference morale was negatively correlated 
with the chairman's sharing of his role, lend 
support to the White and Lippitt findings 
of the ineffectiveness of the laissez-faire 
group. 

McCurdy and Lambert (1952) and 
Adams (1954) stated that neither a demo- 
cratic nor an authoritarian atmosphere 
proved the more favorable for group per- 
formance. In his Prudential study, where 
the emphasis was not on the group atmos- 
phere, Pelz (1951) found that supervisors 
of high-producing work groups either made 
recommendations which generally went 
through for the promotion of the workers or 
they made no recommendations at all. On 
the other hand, supervisors of low-producing 
work groups often recommended promo- 
tions which were generally not granted. 
These results are somewhat consistent with 
the hypothesis that the leader who does help 
his people has an effective group. 

In analyzing a leader’s behavior toward 
his group and group effectiveness, investi- 
gators have assumed the uniformity of mem- 
ber needs across groups. However, when 
needs are derived from general attitudes or 
values, individuals, and hence, groups are 
likely to vary in their needs. Consequently, 
a leader's ability to satisfy group member 
needs which stem from values presents 


another possible determinant of group effec- 
tiveness. 

Values or general attitudes, regardless of 
when or how they develop, can affect one's 
expectations or desires concerning how 
others should behave, and also determine 
one's own behavior. Authoritarians and 
equalitarians were found by Sanford
(1950) to vary consistently in what they be- 
lieved were proper role behaviors for a 
variety of leaders in society. Sanford 
stressed the importance of the follower's 
satisfaction with a leader as a function of 
the former's needs and the latter's person- 
ality or ability to meet these needs. Both the 
needs and ability to satisfy these needs were 
viewed as a function of the authoritarian- 
equalitarian syndrome. The data indicated 
that a group of authoritarians in a situation 
would desire their leaders to behave in cer- 
tain ways, whereas equalitarians in the same 
situation would want different behaviors. 
Groups might be expected to vary, then, in
satisfaction with a leader depending upon 
his authoritarianism. Eager and Smith 
(1952) and Haythorn, Couch, Haefner, 
Langham, and Carter (1956a) found that 
paper-and-pencil measures of authoritarian- 
ism related consistently with an individual’s 
overt behavior. Based on his study of au- 
thoritarianism and decision making, Vroom 
(1959) suggested the need to investigate 
the effects of the interaction of leader and 
follower characteristics. 

The theoretical discussion and the various 
studies support the following specific hypo- 
theses. The general hypothesis is, of course, 
that the more a leader satisfies the needs of 
his group members, the more effective group 
performance. 

Hypothesis I. The more a leader has a
history of solving the everyday and common 
problems of his group members, the more 
effective group performance. 

Hypothesis II. The more a leader's be-
havior satisfies the role expectations that
group members have for a leader, the more 
effective group performance. 

Hypothesis III. The more a leader's au-
thoritarianism is similar to that of his group mem-
bers, the more effective group performance.
The hypotheses were tested during a
series of field studies covering a period of 
6 years, 1952 to 1957, and the research was 
conducted at Fort Benning, Georgia; Camp 
Atterbury, Indiana; Fort Lewis, Washing- 
ton; and Stead Air Force Base, Reno, 
Nevada. In each of the four studies, which 
used military populations and tasks, other 
hypotheses were tested and the findings are 
reported elsewhere. The attempt was made 
to replicate the findings of one study in 
those conducted later and to pursue new 
hypotheses. M. Dean Havron, president of 
Human Sciences Research, Incorporated, 
was the principal investigator in each study. 


STUDY A


Observations of infantry rifle squads dur- 
ing field problems suggested, for example, 
that a leader’s stature and tone of voice bore 
little, if any, relationship to his control of 
the squad or to the squad’s performance. 
The significant variables appeared to be less 
obvious and more subtle. On the basis of 
discussions with noncommissioned officers 
and with one particular squad leader (SL) 
who had just been tested on a field problem 
the idea was developed of reciprocality of 
indulgences and its relationship to squad 
effectiveness. The sergeant explained that 
when back at the post he tried to make life 
easy for his men and he helped them in per-
sonal matters. Therefore, when he was on
the "spot" in the field his men tried to per- 
form well so that the credit from superiors 
came to him as the SL responsible for their 
effectiveness. 

Hypotheses I and II were tested in Study 
A. Although the problem solving (PS) 
hypothesis evolved directly from the dis- 
cussions with noncommissioned officers, the 
role discrepancy (RD) hypothesis handles
more specific expectations or desires on the
part of group members for leader behavior.
While the PS hypothesis is concerned with
needs common to nearly all people, the RD
hypothesis involves needs for leader be-
havior which may vary considerably from
one group member to another and from one
group to another.


Method 


Subjects. The Ss in Study A were the men of 
the 13 highest scoring and the 13 lowest scoring 
Army rifle squads of 63 squads tested in 1952 at 
Camp Atterbury, Indiana, on a criterion rifle 
squad field problem (RSFP). Criterion RSFP 
scores reflected the performance of both the SL
and squad men. However, there was no over- 
lapping in separate scores between the “effective” 
and “ineffective” squads for leader scores or group 
scores (Havron, Greer, & Galanter, 1952). At 
the time of the research, the Army infantry rifle 
squad consisted of nine men: a leader, an assistant, 
a two-man automatic rifle team, and five riflemen.
As the basic unit of the infantry the squad closes
with and attempts to overcome the enemy.


Independent variables. The 5-item Problem 
Solving Index measured how much a man per- 
ceived his SL as generally helpful or as a problem 
solver. Each item was responded to in terms of
five-step intervals. A squad man indicated his 
SL was a problem solver with a high rating. 
Examples of the items are: “Is your squad leader 
good at figuring out easy ways to do things 
when the squad has a detail?” “How often does 
your squad leader help the squad men out in 
personal matters?” 

The 13-item Role Discrepancy Index measured 
the extent to which an SL was perceived as be- 
having according to a man's ideal. For instance,
squad men were asked: "How strictly should a
squad leader control his squad?" “How far 
should a squad leader go in talking about his 
personal life with his men?" 

A man was first asked for each of the 13 
items how an SL should behave. The alternatives 
then presented were in terms of five steps, and 
ranged from such statements as "never" to
"always," and from "none" to "a lot." After re-
sponding to the thirteenth item, the same questions
were asked again with the squad man responding 
in terms of his perception of his own SL's be- 
havior. The sum of discrepancies between actual 
and ideal responses was assumed to be an index 
of the extent to which an SL's role behaviors 
were not satisfying a squad man's needs. 
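
The scoring of the Role Discrepancy Index can be sketched as follows. This is a minimal illustration, assuming that each of the 13 items is coded 1 to 5 and that a discrepancy is the absolute difference between the ideal and the perceived position; the use of absolute value and the function names are assumptions, since the text specifies only a sum of discrepancies.

```python
def rd_index(ideal, actual):
    """Sum over the 13 items of |ideal - actual| (absolute value is an assumption)."""
    if len(ideal) != 13 or len(actual) != 13:
        raise ValueError("expected 13 ideal and 13 actual responses")
    return sum(abs(i - a) for i, a in zip(ideal, actual))


def squad_average(member_scores):
    """Average index score across the men of one squad."""
    return sum(member_scores) / len(member_scores)
```

A man whose leader matches his ideal on every item scores 0, and larger totals indicate a leader who departs further from what that man wants.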


Performance criterion. The dependent variable 
was a squad's performance on a 6-hour, blank- 
firing, simulated combat infantry RSFP. The field 
problem was developed by Havron, Fay, and Mc- 
Grath (1952) working in close collaboration with 
military tacticians at the Infantry School, Fort 
Benning, Georgia. Considerable realism was in- 
troduced into the problem through noise and con- 
fusion produced by explosives and live resistance. 
The number of aggressors along a squad's route 
varied from one to six. During the RSFP the SL 
was the pivotal figure. Communications between 
the squad and umpires went through the SL.
The pressure was often on the SL to make prompt 
decisions. All leaders appeared to try hard to
have their squads do well. Indeed, a squad's per-
formance was considered by the military to re-
flect largely on the SL.

Although technical knowledge contributed to 
performance, only a modest amount of competence 
was needed in order to perform well in many 
situations. Motivation to perform well, then, 
was critical. On the RSFP squad men could fail 
to carry out their functions without too much 
awareness by other squad men that they were 
falling down on the job. However, the inappro- 
priate behavior, such as not maintaining proper 
formation, not aiming rifle fire, and not using 
cover/concealment would be noted by umpires and 
reflected in the squad's final score. 

Ratings of squad performance were made by 
trained observers under specified circumstances 
and at certain times during the course of the 
RSFP. Most ratings pertained to group behavior 
without regard to particular individuals. How- 
ever, SL score and a squad score could be derived 
separately. Observers had a median reliability 
of .88 for interrater agreement for 20 randomly 
selected squads (Havron, Fay, & McGrath, 1952). 

Procedure. Effective and ineffective squads had 
to be selected after all of the 63 squads at Camp 
Atterbury were tested. A delay of 2 to 6 weeks 
occurred between field testing and the interviews 
in which the hypotheses were tested. Eight 
civilian interviewers spent between 1 and 2 hours 
with each of the squad men reading questions to 
the respondent and recording his verbal responses. 
Neither interviewers nor interviewees knew 
whether a squad man had been a member of an 
effective or ineffective squad. Problem solver 
and role discrepancy items appeared toward the 
end of the interview; additionally, the men were 
assured their anonymity would be preserved. 

The score on the PS index was the simple 
sum for the five items. A score on the RD 
index represented the summation of differences 
for each one of the 13 items between the posi- 
tion on the continuum for the desired SL be- 
havior and the position for the perceived SL be- 
havior. Chi square was used for both indexes 
to test the significance of the difference between 
the distributions of scores for all men in the 
effective groups and all men in the ineffective 
groups. A more rigorous statistical criterion re-
quired that for both indexes the t test be em-
ployed to determine the significance of the differ-
ence between the mean squad scores for effective 
and ineffective groups. The mean in these cases 
represented the mean of the average score for 
each of the 13 squads.


Results 


Problem solving. The distributions of PS 
scores for men in effective and ineffective 
squads were grouped in three categories and 
the chi square test indicated the predicted 
difference was significant (χ² = 8.94; 2 df,
p < .02). Moreover, effective men per-
ceived their SL as more of a problem solver
than ineffective men did on each of the five 
items in the PS index. For two of the items 
the difference was significant: “When your 
squad leader tries to get a ‘good deal’ for 
his men, how often is he successful?” 
(χ² = 5.40; 1 df, p < .03) “When some
men in your squad do something wrong, is 
your squad leader able to handle the situa- 
tion without taking it to the platoon ser-
geant or platoon leader?” (χ² = 8.57; 2 df,
p < .02). The mean of the squad averages
for members on the PS index in each squad 
in the ineffective group was 15.9 as com- 
pared with a mean of averages for effective 
squads of 17.7. For the ineffective squads 
the averages ranged from 10.5 to 21.0; for 
the effective squads the range was 15.5 to 
20.7. The standard deviation for the ineffec- 
tive group means was 2.6 as compared with 
1.5 for the effective group. The difference 
between the variances was not accepted as 
significant (F = 3.08; df = 12 and 12, p < 
.10). The difference between the means of 
the averages for the effective and ineffective 
groups was significant (t = 2.10; 24 df, 
p < .03, one-tailed test). 
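
As a rough check, the t and F values can be recomputed from the summary statistics given above. The sketch assumes a pooled-variance t test on the 13 squad averages per group and an F ratio with the larger variance in the numerator; the text does not state which variance estimator was used, so both choices and the function names are assumptions.

```python
import math


def t_from_summary(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Pooled-variance two-sample t and its degrees of freedom."""
    df = n_1 + n_2 - 2
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n_1 + 1 / n_2))
    return (mean_1 - mean_2) / se, df


def variance_ratio(sd_a, sd_b):
    """F ratio with the larger variance in the numerator."""
    larger, smaller = max(sd_a, sd_b), min(sd_a, sd_b)
    return (larger / smaller) ** 2


# Problem Solving Index: effective squads versus ineffective squads.
t_ps, df_ps = t_from_summary(17.7, 1.5, 13, 15.9, 2.6, 13)
f_ps = variance_ratio(2.6, 1.5)
```

With the rounded summary values this gives a t of about 2.16 on 24 df and an F of about 3.00, close to but not identical with the reported 2.10 and 3.08; the gap presumably reflects rounding of the published means and standard deviations.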

Role discrepancy. Men differed widely in 
desired SL behavior. Generally, the distri- 
butions of responses to the 13 items were 
flat. The distribution of responses to each 
ideal leader item for men in effective squads 
was compared with the distribution for the 
ineffective squads and similarly for re- 
sponses concerning actual SL behavior. For 
only one of the 26 comparisons was there a 
significant chi square (p < .05). Therefore, 
it cannot be concluded that men of effective 
squads as compared with men of ineffective 
squads necessarily differ in their desires for 
SL behaviors or in how they perceive their 
squad leaders’ behaviors in the areas of the 
RD items. 

The method of obtaining discrepancy 
scores for the RD index presented a prob- 
lem because a person who gave a response 
on the ideal behavior position “3,” where
“1,” “2,” “3,” “4,” and “5” were possible,
could not have had a discrepancy greater 
than two points. The chi square test showed 
no significant difference between effective and
ineffective men in their response positions 
to the ideal behavior items. However, men 
of the effective group had an average poten- 
tial discrepancy score for each item of 3.10; 
for the ineffective group the average was 
3.01. A lower ceiling, then, was actually 
placed on the discrepancy scores for men 
of the ineffective squads. The distributions 
of scores for both the effective and ineffec- 
tive squad men on the RD index were 
skewed toward the direction of low dis- 
crepancy scores. The predicted difference 
between men of effective and ineffective 
squads was significant (χ² = 7.78; 2 df,
p < .03). Individual analysis of the 13 RD
items indicated that the mean scores for 
men of the effective squads were less than 
the mean scores for men of the ineffective 
squads in the case of 12 of the 13 items 
(p < .01).

The mean of the averages for the RD 
scores for the 13 ineffective squads was 13.1, 
while it was 11.4 for the 13 effective squads.
The standard deviations were 3.3 and 3.1,
respectively, and the variances did not differ 
significantly. The range of average scores 
for the squads in the ineffective group was 
between 9.3 and 21.8; the range was be- 
tween 8.0 and 18.0 for squads in the effec- 
tive group. The difference between the mean 
average scores of the effective and the in- 
effective squads of 1.7 was nonsignificant 
(t = 1.35; 24 df, p < .10, one-tailed test).

An adjustment was made in the scores 
of the ineffective squads to control the
effects of their lower ceiling for discrepancy
scores. In making the adjustment it was
assumed that the discrepancy score would 
increase with a higher ceiling by an amount 
directly proportional to the ratio between 
the previous discrepancy scores and the po- 
tential for discrepancy. As a consequence, 
the adjusted difference between the mean 
average scores for the two groups was 2.1 
and yielded a t value of 1.65 which
approaches significance (t = 1.71; 24 df,
p < .05, one-tailed test).
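
The ceiling adjustment can be illustrated with the reported figures. The sketch below is one reading of the procedure, not a confirmed reconstruction: it assumes that the potential discrepancy for an ideal response is its distance from the farther end of the five-step scale, and that the ineffective group's mean score is scaled up by the ratio of the two groups' average potential discrepancies (3.10 and 3.01). The function names are hypothetical.

```python
def potential_discrepancy(ideal, low=1, high=5):
    """Largest discrepancy possible for a given ideal response on a 1-5 scale."""
    return max(ideal - low, high - ideal)


def adjust_for_ceiling(score, own_potential, reference_potential):
    """Scale a discrepancy score up to the ceiling of the reference group."""
    return score * reference_potential / own_potential


# Reported group means: 13.1 (ineffective squads) and 11.4 (effective squads).
ineffective_adjusted = adjust_for_ceiling(13.1, own_potential=3.01, reference_potential=3.10)
adjusted_difference = ineffective_adjusted - 11.4
```

Under this reading the adjusted difference rounds to 2.1, which agrees with the figure reported above, although agreement on a single rounded value does not establish that this was the procedure actually followed.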


STUDY B


The purpose of this study was to test
Hypothesis III on the similarity between 


leader and men on authoritarianism and 
group effectiveness. The analyses were per- 
formed using already extant data which had 
been collected for other purposes (Havron, 
Fay, & McGrath, 1952). The results of 
Study A led to Hypothesis III.


Method 


Subjects. One hundred 9-man Army infantry 
rifle squads were the Ss in Study B. Thirty-seven
of these squads were tested in the late spring of 
1952 at Fort Benning, and the remaining 63 
squads went through the same RSFP that sum-
mer at Camp Atterbury, Indiana. The 26 squads
of Study A comprised part of the 63 squads tested
at Camp Atterbury.

Independent variables. The 16-item Equali- 
tarian Index (EI) developed by Havron, Fay, and 
McGrath (1952) was described as a measure of 
the extent to which one accepts the rules and 
mores of a system according to their merits. 
In an independent study Masling, Greer, and Gil-
more (1955) gave eight psychologists a descrip-
tion of the authoritarian and equalitarian per-
sonalities based on the work of Sanford (1950). 
The judges were then asked to categorize the re-
sponses to the items in the EI as either authori-
tarian or equalitarian. There was 99% agreement
in designations among the psychologists, and their 
consensus designations were in complete agree- 
ment with the original scoring procedure. Ex- 
amples of items in the index are: 

A leader: 

a. should always control his group sternly 
(Authoritarian [A] response) 

b. is to blame if anything goes wrong (A 
response) 

c. must always consider the wishes of his
group (Equalitarian [E] response) 

If you have a problem you should: 

a. go to your superior or chaplain for ad- 
vice (A response) 

b. talk it over with your friends (E re- 
sponse) 

c. try to forget about it (A response) 

If you disagree with the leader of your group, 

you should: 

a. try to get him to see your side of it (E 
response) 

b. speak out with your own ideas (A re-
sponse)

c. follow anyway (A response) 

Performance criterion. The 6-hour, blank-firing, 
four-phase, Army infantry RSFP used in Study 
B as well as in Study A involved very specific 
ratings of leader and squad behaviors by trained 
observers. The total problem score included rat- 
ings on leader and squad items; separate SL and
squad scores could be derived. In the research 
reported by Havron, Fay, and McGrath (1952) 
the dependent variable was taken as the total 
score. A correlation of .66 was found between SL 
and squad scores for the 100 squads (Greer, 
1955). Squad scores were used as the dependent 
variable in Study B. Squad scores for the 100
squads ranged from 41 to 86 with a mean of 
63.5 and standard deviation of 11.0. The median 
score was 64 with 100 as the highest score pos- 
sible. 


Procedure. On the day following the field prob- 
lem, each man completed an attitude question- 
naire which included the 16-item EI. The product- 
moment correlation was employed to determine 
the relationship between the attitude discrepancy 
on authoritarianism (ADA) for leader and men 
and squad performance. An ADA score repre- 
sented the difference between an SL's score on 
the EI and the mean EI score of his men. 
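
The ADA score just defined can be illustrated with a minimal Python sketch. The function name and the sample scores are hypothetical, and taking the absolute value is an assumption made here: the text says only "difference," but the later analyses treat the size of the discrepancy separately from its direction.

```python
from statistics import mean


def ada_score(leader_ei, men_ei):
    """Attitude discrepancy on authoritarianism (ADA) for one squad.

    Assumption: the size of the discrepancy is what is scored, so the
    absolute value of (leader score - mean of the men's scores) is returned.
    """
    return abs(leader_ei - mean(men_ei))


# Hypothetical squad: a leader scoring 9 and eight men on the 16-item EI.
example_men = [7, 8, 6, 9, 8, 7, 10, 5]
example_ada = ada_score(9, example_men)
```

With these made-up scores the men average 7.5, so the leader's discrepancy is 1.5 EI points.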


Results 


Leader and men relationship on authoritarianism. 
The mean EI score, a measure 
of authoritarianism or rather the lack of it, 
for the 100 SLs in Study B was 8.43; for 
the 800 squad men it was 7.58. Leaders 
were significantly less authoritarian than 
their men (t = 2.17; p < .05, two-tailed 
test). A nonsignificant correlation of .15 
was obtained for the 100 groups between an 
SL’s EI score and the mean EI score for his 
men. 


Authoritarianism and squad effectiveness. 
The SL’s EI score and the mean EI squad 
score were separately correlated with squad 
performance. In each case the obtained 
correlation was .13 and nonsignificant. The 
possibility was investigated that there might 
be a relationship between the general simi- 
larity on authoritarianism for all group 
members and squad performance, regardless 
of the group's mean position on the 
attitude continuum. The mean EI score 
for the nine men in a squad was 
determined and the differences between each 
man’s score and the mean were summed. A 
significant correlation of —.21 was obtained 
between authoritarian heterogeneity in the 
group and squad performance, that is, in- 
effective performance was associated with 
men's dissimilarity in authoritarianism 
(p < .05, two-tailed test). Group hetero- 


geneity scores on authoritarianism and ADA 
scores had a significant correlation of .25 
(p < .05, two-tailed test). 
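
A minimal sketch of the heterogeneity index may make the computation concrete. The function name and scores are hypothetical. Absolute deviations are assumed, since signed deviations from a mean always sum to zero and could not yield a usable index.

```python
from statistics import mean


def squad_heterogeneity(ei_scores):
    """Sum of each man's absolute deviation from the squad's mean EI score.

    Assumption: "differences" in the text means absolute differences.
    """
    squad_mean = mean(ei_scores)
    return sum(abs(score - squad_mean) for score in ei_scores)


# Hypothetical nine-man squad (leader included) with a mean EI score of 8.
example_squad = [9, 7, 8, 6, 9, 8, 7, 10, 8]
example_heterogeneity = squad_heterogeneity(example_squad)
```

A squad whose members all give the same score would receive an index of 0; larger values indicate greater dissimilarity among members.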


Leader and men attitude discrepancy and 
squad effectiveness. The attitude discrep- 
ancy as represented by the difference be- 
tween the SL’s score and the mean score for 
his men on authoritarianism yielded a signi- 
ficant correlation of —.22 with squad per- 
formance, that is, ineffective performance 
was associated with leader-men dissimilarity 
in authoritarianism (p < .05, one-tailed 
test). A significant correlation of —.18 
(p < .05, one-tailed test) was obtained for 
the 100 squads between the sum of the 
difference between each man's EI score and 
the leader's EI score and field performance. 
Unless otherwise indicated the effect of atti- 
tude discrepancy has been based in these 
studies on mean group member authori- 
tarianism. The data were further analyzed 
to determine whether it made any difference 
for the SL to be either higher or lower in 
his EI score as compared with the mean EI 
score of his squad men. In those 65 cases 
where the SL was more equalitarian or less 
authoritarian than the squad, the correlation 
between the discrepancy and performance 
was —.25 and significant (p < .05, one- 
tailed test). The correlation was —.30 and 
significant (p < .05, one-tailed test) for the 
34 cases where the SL was less equalitarian 
than his men. For one squad there was no 
difference between the SL’s score and the 
mean score of his men. 


Correlations as a function of group's age. 
Men varied in the length of time they had 
been members of their squads. The Ss com- 
pleted a questionnaire (Greer, 1955) in 
which they responded to an item on how 
long they had been in their squads. The item 
was: “How long have you been a member 
of this squad? (a) one month or less, (b) 
one to three months, (c) three to six months, 
(d) six months to one year, (e) more than 
one year.” Responses of each of the nine 
men in a squad were averaged so that squads 
could be compared in terms of their "ages." 
The crude measure of a squad's age was 
used to separate the 100 squads into three 
fairly equal groups. This procedure essentially 
rank ordered these three groups of 
squads in terms of age. Analyses were conducted 
to determine what effect different 
squad ages would have on the size of the 
correlations for different aspects of authori- 
tarianism and squad effectiveness. For the 
31 squads where men had been together the 
longest the correlation between the leader’s 
EI score and the mean for his men was .16 
and nonsignificant. This correlation is 
hardly different from the one when all 100 
squads were considered. Table 1 presents 
additional results. There appears to be no 
consistent effect of age on squad mean or 
heterogeneity for EI scores and squad effec- 
tiveness. However, the size of the correla- 
tion between ADA and squad effectiveness 
increased as squads became older. A signi- 
ficant correlation of —.44 (p < .01, one- 
tailed test) was obtained between the sum of 
the differences between the SL’s EI score 
and the EI score for each of his men and 
field performance for the 31 squads in the 
oldest group. The data in Table 1 further 
indicate that neither the ADA correlation 
nor the squad heterogeneity correlation can 
be attributed to each other. 


The midscoring leader and men on authoritarianism 
and squad effectiveness. In a 
study of 37 medium bombardment aircrews 
Adams (1954) related a measure of authoritarianism 
for the 5 officers in each crew of 
11 positions to the crew's performance derived 
from squadron training reports. The 
personalities of the officers were assumed to 
establish the extent of democratic or authoritarian 
climate in a crew. There was no significant 
correlation for the total distribution, 
although there was a suggestion that equali- 
tarianism and effective performance were 
related. However, the highest performance 
ratings were received by those crews where 
officers were generally neither authoritarian 
nor equalitarian but fell in the middle of the 
continuum. A correlation ratio achieved 
significance at the .10 level. Explanations 
offered included the belief that officers who 
rated crew performance simply downgraded 
the democratic crews and that the nature of 
the tasks the crews faced prevented effective 
performance by democratic groups. 

A possibility not considered by Adams 
is that men who are neither too authoritarian 
nor too equalitarian are simply more effec- 
tive leaders. If the determining variable is 
the personality of the leader and not the 
reciprocality of indulgences between SL and 
men, then the analyses of data in Study B 
should indicate a curvilinear relationship 
between the authoritarianism of leaders and 
squad RSFP score. The attitude discrep- 
ancy score between an SL and his men on 
authoritarianism would seem to have a 
higher probability of being smaller when the 
SL is a midscorer. Hence, the ADA corre- 
lations in Study B might simply be a statis- 
tical artifact of the greater leadership effec- 
tiveness of the midscoring SLs. 

The correlation ratio was .33 for the nonlinear 
relationship between the SL's EI 
score and the squad's performance for the 
100 squads. The t value was 3.75, but there 
was no significant difference when testing 
for difference from linearity. If the midscoring 
SLs were most effective because of 
inherent leadership ability, then a nonlinear 
relationship should be higher when the criterion 
was based solely on those items of 
performance on which the SL was rated. 
This eta was .38 with a t value of 4.40, but 
the test for the difference from linearity 
was nonsignificant. Moreover, the two etas 
were hardly different in magnitude. 


TABLE 1 

PRODUCT-MOMENT CORRELATIONS FOR VARIOUS ASPECTS OF AUTHORITARIANISM AND 
SQUAD EFFECTIVENESS AS A FUNCTION OF GROUP AGEᵃ 

Authoritarian          All Groups   Youngest Groups   Middle Groups   Oldest Groups 
Aspect                 (N = 100)    (N = 34)          (N = 35)        (N = 31) 
Squad Mean                .13          .07               .19             .12 
Squad Heterogeneity      −.21*        −.36*             −.10            −.19 
Leader-Men Attitude 
  Discrepancy            −.22**        .09              −.25            −.46*** 

ᵃ Scores on the Equalitarian Index used as measures of authoritarianism. 
* p < .05, two-tailed test. 
** p < .05, one-tailed test. 
*** p < .01, one-tailed test. 
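
The correlation ratio and a test for departure from linearity can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used in the study: the text does not say which linearity test was applied, so the textbook F ratio comparing eta squared with r squared is used here, and the data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def eta_squared(x, y):
    """Correlation ratio squared of y on x, with cases grouped by x value."""
    grand_mean = sum(y) / len(y)
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    ss_total = sum((yi - grand_mean) ** 2 for yi in y)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return ss_between / ss_total, len(groups)


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def linearity_f(x, y):
    """F ratio for departure from linearity (assumed textbook form):
    F = [(eta^2 - r^2) / (k - 2)] / [(1 - eta^2) / (N - k)]."""
    eta_sq, k = eta_squared(x, y)
    r = pearson_r(x, y)
    n = len(y)
    return ((eta_sq - r ** 2) / (k - 2)) / ((1 - eta_sq) / (n - k))


# Hypothetical data: three leader score levels, two squads at each level,
# with the middle level performing best.
leader_scores = [1, 1, 2, 2, 3, 3]
squad_scores = [1, 3, 5, 7, 1, 3]
example_eta_sq, _ = eta_squared(leader_scores, squad_scores)
example_f = linearity_f(leader_scores, squad_scores)
```

In this made-up example the linear correlation is zero while eta squared is large, which is the pattern the personality hypothesis would predict if midscoring leaders were the most effective.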

The correlation ratios, although offering 
little conclusive evidence, were provocative 
enough for the matter to be pursued. A 
group of midscoring SLs on the EI was 
matched with a group of nonmidscoring 
leaders on their attitude discrepancies with 
squad men. According to the personality 
hypothesis the squad performance scores 
should be higher in the case of midscoring 
leaders. SLs scoring 8 and 9 on the EI 
(leaders' mean = 8.43) were taken as representing 
the midscoring SLs. The remaining 
70 cases were randomly ordered. The 30 
midscoring cases were ordered according to 
the size of the discrepancy between SL and 
men on the EI. Each of the 30 cases was 
matched with the first one of the 70 cases 
where the size of the ADA was identical. 
Thus, 19 of the 30 cases were matched on 
this basis. The mean performance score for 
the 19 squads where the SL was a midscorer 
was 64.9, while for the other 19 
squads where the SL was a nonmidscorer 
the mean was 63.6. The t value of .35 for 
the difference between the means was nonsignificant. 
The results did not offer any 
statistically significant support to an explanation 
which is based entirely on the sheer 
greater leadership ability of the midscorer 
without reference to the nature of his group. 


TABLE 2 

MEAN SQUAD PERFORMANCE, MEAN ATTITUDE DISCREPANCY, AND RANK DIFFERENCE 
CORRELATIONS FOR LEADER-MEN ATTITUDE DISCREPANCY AND SQUAD PERFORMANCE 
AS A FUNCTION OF LEADER EQUALITARIAN INDEX SCORE FOR THE 100 SQUADS IN STUDY B 

Leader Equalitarian         Mean Squad    Mean Attitude   Rank Difference 
Index Score            N    Performance   Discrepancy     Correlation 
 3                     1       44.0           2.2            –ᵃ 
 4                     2       56.5           4.2            –ᵃ 
 5                    12       58.9           2.3            −.33 
 6                    12       62.0           1.6            −.07 
 7                    10       67.0           0.9            −.10 
 8                     9       64.4           0.8             .15 
 9                    21       67.5           1.7            −.48** 
10                    11       63.2           2.4             .15 
11                    13       61.0           3.2            −.59** 
12                     3       68.0           4.7            –ᵃ 
13                     4       62.3           5.1            –ᵃ 
14                     2       65.5           5.9            –ᵃ 
Mean 8.4                       63.5           2.2 

ᵃ Insufficient N. 
** p < .05, one-tailed test. 
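
The matching procedure described above can be sketched in Python. The function name, the tuple layout, and the sample cases are hypothetical, and matching without replacement (each nonmidscoring case used at most once) is an assumption, although it is consistent with the report that only 19 of the 30 cases found a match.

```python
import random


def match_on_ada(mid_cases, other_cases, seed=0):
    """Pair each midscoring case with the first unused comparison case
    that has an identical ADA score.

    Each case is a tuple: (squad_id, ada, performance).
    Assumption: a comparison case can be used for only one match.
    """
    pool = list(other_cases)
    random.Random(seed).shuffle(pool)  # random ordering of comparison cases
    pairs = []
    # Midscoring cases are taken in order of discrepancy size.
    for case in sorted(mid_cases, key=lambda c: c[1]):
        for i, candidate in enumerate(pool):
            if candidate[1] == case[1]:
                pairs.append((case, pool.pop(i)))
                break
    return pairs


# Hypothetical cases: (squad id, ADA score, squad performance score).
mid = [("m1", 1.0, 65), ("m2", 2.0, 70), ("m3", 3.5, 60)]
others = [("o1", 2.0, 62), ("o2", 1.0, 66), ("o3", 1.0, 59), ("o4", 4.0, 61)]
example_pairs = match_on_ada(mid, others)
```

The mean performance scores of the two matched groups can then be compared with a t test, as in the text; midscoring cases with no identical ADA among the comparison cases simply drop out.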

Additional analyses of the data of Study 
B are shown in Table 2. Although the cor- 
relation ratio obtained between leaders’ EI 
scores and squad performance could not be 
shown to differ significantly from linearity, 
the mean squad performance scores bear an 
empirically nonlinear relationship to SL authoritarianism. 
The mean squad performance 
score was 66.7 for the 40 SLs falling 
in the middle of the distribution on EI 
scores. The 60 extreme scoring SLs had a 
mean squad performance score of 61.3. The 
t value was 2.43 for the difference between 
these means (p < .02, two-tailed test). 
This result could be interpreted as offering 
support for the personality hypothesis. 

The size of the sample in Study B and 
the restricted range for the EI scores per- 
mitted the determination of whether there 
was any relationship between leader-men 
ADA and squad performance when SLs 
had identical EI scores. Statistically significant 
results would militate against a 
simple and complete explanation of the overall 
empirical relationship as based upon the 
nature of the SL without reference to his 
group. Table 2 shows a significant rho for 
the SLs with EI scores at the mode in the 
distribution and a significant rho for leaders 
with a score which fell toward one of the 
tails in the distribution. If the correlations 
in Table 2 are transformed to z's and averaged, 
the resulting average correlation is 
−.20. A correlation of −.20 when based on 
an N of 88 would be significant (p < .05, 
one-tailed test). This procedure is, of 
course, questionable. Nevertheless, there 
was some statistically significant data, that 
is, the two significant rho's, to indicate the 
importance of the leader-men interrelation- 
ship on authoritarianism as a contributor to 
group performance. The findings do not 
permit any definitive statement on whether 
or how much the leader's authoritarianism 
without reference to his men contributes in 
this situation to group effectiveness. 
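
The averaging step can be checked with a short sketch. It assumes an unweighted average of Fisher z values and takes the seven rank difference correlations for leader EI scores 5 through 11 in Table 2 as −.33, −.07, −.10, .15, −.48, .15, and −.59; the function name is hypothetical.

```python
from math import atanh, tanh

# Rank difference correlations for leader EI scores 5 through 11 (Table 2).
table2_rhos = [-0.33, -0.07, -0.10, 0.15, -0.48, 0.15, -0.59]


def average_correlation(correlations):
    """Average correlations via Fisher's z transformation.

    Assumption: each correlation is weighted equally, regardless of N.
    """
    z_values = [atanh(r) for r in correlations]
    return tanh(sum(z_values) / len(z_values))


avg_rho = average_correlation(table2_rhos)
```

Under these assumptions the average rounds to −.20, in agreement with the value reported in the text.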

Several of the analyses performed with all 
cases in Study B and illustrated in Table 2 
were carried out just for those 31 squads 
where the men had been together the longest 
and where the ADA correlation was —.46. 
The eight squad leaders who scored 9 on 
the EI were considered the midscorers. The 
mean performance score for their squads 
was 74.1; for the other 23 squads the mean 
was 64.1. The difference between the means 
yielded a t value of 3.31 (p < .01, 29 df, 
two-tailed test). Eleven of the 31 SLs had 
scores falling between 3 and 8; their ADA 
rho correlation was —.75 (p < .01, one- 
tailed test). The eight SLs who scored 9 
had a nonsignificant rho of .06. The re- 
moval of one case would change the correla- 
tion to —.41 (p > .05). For the 12 SLs 
who scored between 10 and 13 the correla- 
tion was —.57 (p < .05, one-tailed test). 
These results can generally be interpreted to 
support the theory of the reciprocality of 
indulgences, but the correlations are still 
suspect because of the possibility that they 
are a consequence of a statistical artifact. 
However, the size of the correlations and 
the restricted ranges of the leaders' EI 
scores suggest that the correlation did not 
simply reflect a statistical artifact. 

The data indicated, then, that the overall 
correlation between ADA scores and RSFP 
scores could not be attributed solely to an 
artifactual relationship based on the leader's 
EI score. However, the possibility still existed 
that the ADA correlation might be due 
to an artifactual relationship with the mean 
EI scores of the men in a group. Table 3 
presents data similar to Table 2, only this 
time the mean EI scores of the men were 
held constant. Bass (1960) pointed out that 


group heterogeneity on a variable will be 
forced to become smaller as the group mean 
approaches either end of the continuum. 
The data in Table 3 tend to be consistent 
with this expectation. Since heterogeneity 
on authoritarianism was found to be 
negatively correlated with performance, one 
might expect a positive relationship between 
extreme group means on authoritarianism 
and more effective performance. On the 
other hand, an hypothesis that midscorers 
on a measure of authoritarianism are more 
effective would lead to the expectation that 
groups with nonextreme means would per- 
form more effectively. The data in Table 3 
cannot be used to rule out the latter possi- 
bility, but, nevertheless, such a tendency is 
not here even sufficient to mask the ten- 
dency for groups with extreme mean EI 
scores to perform effectively. 

The 35 midscoring squads on mean group 
EI score had a mean field performance score 
of 61.8. A mean performance score of 64.4 
was obtained for the 65 extreme scoring 
squads on mean group EI score. The differ- 
ence between the mean performance scores 
was nonsignificant (t = 1.15; 98 df, p > 
.05, two-tailed test). 

The mean ADA scores do not appear to 
bear a relationship with any of the other 
results in Table 3 although there is a slight 
tendency for discrepancies to be greater for 
extreme scoring squads on the EI. The 
limited range of mean scores in Table 3 
perhaps accounted for the failure to obtain 
a more marked relationship. None of the 
six correlations was significant between 
ADA and performance with squad mean EI 
score held constant. The midpoint of each 
range of squad men's mean EI scores was 
used to determine each ADA score for that 
group. Since all six correlations were in 
the predicted direction, the binomial test 
indicated a significant finding 
(p < .02, one-tailed test). The results 
in Tables 2 and 3 indicated that the overall 
correlation between ADA and RSFP cannot 
be attributed to an artifactual relationship 
with either the leaders' or men's mean 
EI scores. 
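
The binomial (sign) test on the six correlations can be reproduced directly. The sketch below assumes that, under the null hypothesis, each correlation is equally likely to be positive or negative; the function name is hypothetical.

```python
from math import comb


def sign_test_p(successes, trials, p=0.5):
    """One-tailed binomial probability of at least `successes` outcomes
    in the predicted direction out of `trials` independent outcomes."""
    return sum(
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# All six correlations in Table 3 fell in the predicted direction.
p_all_six = sign_test_p(6, 6)
```

The probability is (1/2)^6, or about .016, which falls below the .02 level cited in the text.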

TABLE 3 

MEAN SQUAD PERFORMANCE, MEAN SQUAD HETEROGENEITY, MEAN ATTITUDE DISCREPANCY, AND RANK 
DIFFERENCE CORRELATIONS FOR LEADER-MEN ATTITUDE DISCREPANCY AND SQUAD PERFORMANCE 
AS A FUNCTION OF MEAN MEN EQUALITARIAN INDEX SCORE FOR THE 100 SQUADS IN STUDY B 

Mean Men Equalitarian         Mean Squad    Mean Squad       Mean Attitude   Rank Difference 
Index Scoreᵃ             N    Performance   Heterogeneityᵇ   Discrepancy     Correlation 
5.0-5.4                  1       44.0          19.0             2.2             –ᶜ 
5.5-5.9                  3       69.0          13.8             2.8             –ᶜ 
6.0-6.4                  9       65.6          17.3             1.7             −.17 
6.5-6.9                 13       61.0          15.8             3.1             −.33 
7.0-7.4                 23       62.6          18.2             2.2             −.11 
7.5-7.9                 17       61.9          18.0             [illegible]     −.17 
8.0-8.4                 15       62.4          17.9             2.3             −.30 
8.5-8.9                 11       68.9          17.0             1.8             −.40 
9.0-9.4                  5       68.2          16.7             2.0             –ᶜ 
9.5-9.9                  3       62.3          14.0             2.3             –ᶜ 
Mean 7.6                         63.5          17.2             2.2 

ᵃ Midpoint of each range used in determining attitude discrepancy for rank difference correlation. 
ᵇ Sum of differences between each man's EI score and the mean EI score for all men in a squad. 
ᶜ Insufficient N. 


Leader's motivation, attitude discrepancy, 
and squad effectiveness. In Study B all men 
had been administered a measure of general 
Army adjustment. The purpose of the Gen- 
eral Army Adjustment Index (GAAI) 
(Havron, Fay, & McGrath, 1952) was to 
measure a person’s acceptance of the Army 
and his desire to comply with Army goals. 
One might infer that an SL’s motivation to 
do well on the RSFP would be reflected in 
his GAAI score. The correlation was .10 
and nonsignificant between SL scores on 
GAAI and squad performance. The 100 
squad cases were separated into generally 
equal thirds according to the leader's GAAI 
score and correlations were determined between 
ADA scores and RSFP scores. The 
product-moment correlation for the highest 
third (N = 36) of SLs on GAAI was a nonsignificant 
−.22; for the middle third 
(N = 37) it was a significant −.33 (p < 
.05); and for the lowest third (N = 27) the 
correlation of −.03 was nonsignificant. The 
top two-thirds (N = 73) of SLs on GAAI 
had a significant correlation of −.28 (p < 
.05). There is, then, some evidence to suggest 
that the motivation of the leader for 
the task influences the degree of operation 
of the reciprocality of indulgences. 


Relationship with Study A. In Study A 
comparisons between 13 high scoring squads 


and 13 low scoring squads on the RSFP 
indicated that the SLs of effective squads 
were more apt to be perceived by their men 
as better problem solvers and more likely to 
meet their role expectations for an SL. 
The same 13 high scoring squads had a 
mean ADA score of 1.4; the 13 low scoring 
squads had a mean of 2.0. The difference 
between the means produced a t value of 
2.79 (p < .01, 24 df, one-tailed test). For 
the 26 squads in Study A each of the three 
hypotheses received a confirmation. 

The rho correlation between the 26 squad 
scores on ADA and PS was —.08 and be- 
tween ADA and RD the correlation was .05. 
Both correlations were nonsignificant. However, 
the rho correlation between squad 
scores on PS and RD was −.70 and significant 
(p < .01, two-tailed test). These 
intercorrelations of independent variables 
strongly suggest that ADA is an independent 
type of indulgence from the indulgences involved 
in PS and RD. Moreover, the correlation 
between squad PS and RD scores 
for the leaders indicated that as measured 
there was either a marked overlap in types 
of indulgences involved or that SLs who 
satisfied one type of need were, at least, 
perceived as satisfying the other. 


STUDY C² 


Method 


Hypotheses I and II were again tested in a 
study (Havron, Lybrand, & Cohen, 1954) with 
Army infantry rifle squads in 1954 at Fort Lewis, 
Washington. One hundred and twelve 9-man 
Army infantry rifle squads were the Ss for this 
study. The PS and RD measures can be found 
in their entirety in Appendix II of "The Assess- 
ment and Prediction of Rifle Squad Effectiveness" 
(Havron et al, 1954). The PS variable was 
measured by 10 items either identical or similar 
to those used in Study A. Although the six items 
employed to measure the RD variable were all 
taken from the instrument used in Study A, their 
administration eliminated the necessity of return- 
ing to the item the second time to ask the re- 
spondent how he perceived the behavior of his SL. 
Additionally, the items were scored so that a 
high score indicated the SL was meeting role 
expectations. As an example, the first item in 
this instrument was: 


How does your squad leader compare with an 
ideal leader as to strictness? 


a. He is much too strict (0 points) 
b. He is a little too strict (1 point) 
c. He is as strict as he should be (2 points) 
d. He is not quite strict enough (1 point) 
e. He is not nearly strict enough (0 points) 
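
The scoring of this item can be written as a small lookup. The sketch below uses the point values given for the first item; applying the same key to every RD item is an assumption, since only the first item is reproduced here, and the function name and responses are hypothetical.

```python
# Points for the five response options of the first RD item.
# Assumption: the remaining RD items are keyed the same way.
RD_POINTS = {"a": 0, "b": 1, "c": 2, "d": 1, "e": 0}


def summated_rd_score(responses_by_man):
    """Sum RD points over all items and all responding squad men.

    `responses_by_man` holds one list of option letters per respondent.
    """
    return sum(RD_POINTS[r] for man in responses_by_man for r in man)


# Hypothetical responses of three men to two RD items.
example_responses = [["c", "b"], ["a", "c"], ["d", "e"]]
example_rd_total = summated_rd_score(example_responses)
```

A high total indicates that the men see their SL as close to what an ideal leader should be; responses in either direction away from the ideal lower the score.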


In addition to the criterion field problem de- 
scribed in Study A scores on two new field prob- 
lems, a daylight live-firing problem and a night 
blank-firing problem, patterned after the daylight 
blank-firing problem, were combined to yield an 
overall comprehensive measure of a squad’s pro- 
ficiency on RSFPs. The composite score used as 
the dependent variable in this study represented 
squad men performance and excluded those items 
for which the SL was explicitly rated. The Ss 
spent 8 hours over a 2-day period prior to field 
testing completing predictor instruments. The 
RSFPs took 20 hours and were administered 
during the 3 following days. Summated squad 
men scores for PS and RD were correlated with 
the composite squad men scores on the field problems. 


Results 


The product-moment correlation between 
the 112 squad scores on the PS measure 
and composite RSFP scores was .16 and 
significant (p < .05, one-tailed test). A significant 
correlation of .25 was also obtained 
between scores on the measure of RD and 
squad performance (p < .01, one-tailed 
test). 


² The writer served as a consultant to this 
project with especial reference to the leadership 
area. 


STUDY D 


Study D was conducted at the United 
States Air Force Survival Training School 
at Stead Air Force Base near Reno, Nevada. 
Field research in the Sierra Nevada, in an 
area approximately midway between Truckee 
and Sierraville, California, in the late fall of 
1956 permitted the further testing of each of 
the three hypotheses with a somewhat differ- 
ent population and field problem. Although 
some details of Study D are available else- 
where (Greer, Pearson, & Havron, 1957a, 
1957b, 1957c) the present report involves 
additional analyses of the data. 


Method 


Subjects. Sixty crews³ in training at the 
United States Air Force Training School served 
as Ss. Each crew comprised six men ranging in 
some cases from airmen third class to lieutenant 
colonels. An officer always acted as the leader 
or aircraft commander (AC) for the crew. 
Almost all crews were members of the Strategic 
Air Command; at that time these men periodically 
received survival and evasion training at 
Stead Air Force Base. The 60 crews were tested 
over a period of 5 weeks, representing five differ- 
ent classes. Although the school had expected to 
train students in crews of six men for the five 
testing weekends, administrative problems actually 
resulted in training men the first two weekends in 
crews of 12 men. Since the field problem was to 
be given to 6-man crews, the 12-man crews had 
to be divided into 6-man crews for the first and 
second weekends. The highest ranking officer in a 
derived crew was designated AC. 


³ This number was based on a contractual obligation. 
However, a total of 76 crews were tested, 
and the 16 crews that failed to complete the 
greater part of the field problem were eliminated. 
These 16 crews represented essentially the groups 
tested on the first weekend; this period served to 
establish the field testing procedures. 


Independent variables. The measures used to 
test the hypotheses in Study D can be found in 
"Evasion and Survival Problems and the Prediction 
of Crew Performance: I. Predictor Instruments" 
(Greer et al., 1957b). Three of the PS 
items of Study A comprised the measure of the 
PS variable. Also, the RD hypothesis was tested 
using eight of the items from Study A. However, 
the RD questions were phrased in the same manner 
as those in Study C, but scored in the same 
way as Study A. The PS and RD items were interspersed 
among others in a self-administered 
questionnaire. The 8-item Authoritarian-Equalitarian 
Scale (A-E scale) was used as the measure 
of authoritarianism. Sanford (1950) employed 
the A-E scale as an abridged version of the 
California F Scale (Adorno, Frenkel-Brunswik, 
Levinson, & Sanford, 1950) with a weighting in 
the direction of leadership attitudes. The A-E 
scale had a repeat reliability of .78 for 200 Ss, 
and scores on the A-E scale correlated .67 with 
scores on the F Scale for 130 college freshmen. 


Performance criterion. The Crew Survival 
Capability Test (CREWSCAT) (Greer et al., 
1957c) is a 6-hour, daylight field problem developed 
to measure the potential effectiveness of 
a crew to cope with problems faced on downing 
behind enemy lines. The same general principles 
were employed in the development of this criterion 
as for the earlier studies. However, considerably 
more attention was paid to determining the relative 
importance of various parts of the test for 
weighting purposes. First, an evasion and survival 
terminology and definitions were developed; then, 
18 critical areas were delineated, for example, 
navigation, food, security from enemy awareness. 
A downed crew's goal is to get back to friendly 
lines without suffering death or capture to any of 
its members. Theoretically, one could derive a 
weighting system through simply counting the 
number of times that a crew met disaster due to 
each of the 18 critical areas. However, by the 
very nature of such disasters, reliable data on 
their causes are difficult to obtain; there may 
be no survivors to supply the information. 

A probabilistic model was developed as a basis 
for differential weightings for the various critical 
areas. Two assumptions were made: problems in 
critical areas occur in evasion and survival situations 
with different frequencies, and critical areas 
vary in the likelihood that their relevant problems 
will cause disaster. Therefore, a critical area's 
disaster-producing probability is a function of the 
number of times that a critical area is encountered 
in evasion and survival experiences and the probability 
of disaster resulting each time it is met. 
The recurrence value of a critical area was determined 
from the literature, and the criticality value 
was derived from expert judgments. The relative 
disaster-producing probability, or impediment 
value, of a critical area was obtained through the 
use of the recurrence and criticality values for all 
critical areas. The number of points a particular 
problem situation or item was worth in the 
CREWSCAT was determined as a function of its 
relative importance among other items belonging 
to the same critical area and of that critical area's 
relative importance among all the other critical 
areas. Reliability data on recurrence and criticality 
values, item weighting, specific results, and problems 
associated with this technique were described 
and discussed by Greer et al. (1957a). In Study D 
as in the other studies the nature of the field 
problem was developed independently of psy- 
chological and interpersonal hypotheses on in- 
dividual and group requirements for effective per- 
formance. 
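
One way the weighting model could work is sketched below. The text says only that impediment values and item points are "a function of" recurrence, criticality, and relative importance; the specific forms used here (a product of recurrence and criticality, normalized across areas, and a product of shares for item points) are assumptions, as are all names and numbers.

```python
def impediment_values(recurrence, criticality):
    """Relative disaster-producing probability of each critical area.

    Assumption: an area's raw value is recurrence * criticality, and
    values are normalized so that they sum to 1 across all areas.
    """
    raw = {area: recurrence[area] * criticality[area] for area in recurrence}
    total = sum(raw.values())
    return {area: value / total for area, value in raw.items()}


def item_points(area_share, item_share_within_area, total_points=100):
    """Points for one item, assumed proportional to its area's share and
    to the item's share of importance within that area."""
    return total_points * area_share * item_share_within_area


# Hypothetical values for three of the 18 critical areas.
recurrence = {"navigation": 0.5, "food": 0.3, "security": 0.2}
criticality = {"navigation": 0.2, "food": 0.1, "security": 0.5}
example_weights = impediment_values(recurrence, criticality)
```

In this made-up case navigation (frequent but rarely fatal) and security (rare but often fatal) receive the same weight, which illustrates why both recurrence and criticality enter the model.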

The CREWSCAT starts with the crew 
assembled at a rendezvous point and progresses 
with their movement to a meeting with a par- 
tisan. The crew is directed to a camp site; after 
relevant activity there, they move to a recovery 
area, and finally cross a border. Problem support 
personnel include communications men, umpires, 
friendly sentry, and aggressor forces. School ex- 
perts deemed the CREWSCAT a realistic test, 
and student comments, both oral and written, re- 
flected the same opinion. 

CREWSCAT scores for the 60 crews ranged 
from 34-80 with a mean of 61.2 and a standard 
deviation of 10.2. One hundred was the highest 
possible score. In preparation for a study on the 
relationship between various independent variables 
and the continuance of effective performance under 
conditions of increasing stress the CREWSCAT 
problem form was keyed differently and new 
scores were obtained. Those field test items in 
the problem were excluded which involved sheer 
technical knowledge and did not require the ex- 
penditure of physical energy. Additionally, sub- 
scores were then derived for the first half of the 
CREWSCAT and for the second half. These 
scores correlated .66 and provide one basis for 
judging the overall reliability of CREWSCAT 
scores used in this research. 
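
If the two half scores are treated as parallel halves of the full test, an assumption the text does not make explicitly, the Spearman-Brown formula gives a rough estimate of full-length reliability. The sketch below is illustrative only, since the re-keyed half scores excluded some items that count toward the full CREWSCAT score.

```python
def spearman_brown(r_half, factor=2):
    """Estimated reliability of a test lengthened by `factor`,
    given the correlation between comparable parts."""
    return factor * r_half / (1 + (factor - 1) * r_half)


# Correlation between first-half and second-half CREWSCAT subscores.
full_length_estimate = spearman_brown(0.66)
```

Under that assumption, the half-score correlation of .66 corresponds to a full-length estimate of about .80.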


Procedure. The school’s course lasted 17 days 
of which the first 7 days were spent in formal 
training at Stead. From the eighth day through 
the twelfth day men remained at a static camp 
in the Sierra Nevada where they lived under 
survival conditions. On the twelfth day the men 
started a 3-day trek and upon its completion re- 
turned to Stead. Study D was tailored to this 
schedule. On the second evening of the student's 
stay at the school he completed among other items 
the A-E scale. Ss tested on the CREWSCAT 
completed the PS and RD items the afternoon or 
evening of the day before they were tested on 
the CREWSCAT. 

On the fourth or fifth day in the field Ss left 
their static camp to trek to the research test area. 
Crews had been requested to reach the test area by 
3 PM, but most crews did not arrive until 5 PM, 
and some did not reach the test area until 8 PM. 
The trek from the static camp was extremely 
arduous, requiring the men to go over a snow-covered 
mountain while carrying packs weighing 
more than 40 pounds. Ss walked between 5 and 10 
miles, depending on their abilities to navigate. 
When a crew finally arrived at the test area, the 
men were tired, hungry, thirsty, and extremely 
irritable; some men had collapsed on the way 
and required evacuation. 

In administering the forms the investigator 
faced quite a challenge in promoting among crew 
members even a minimally receptive attitude to- 
ward the project. This situation was hardly im- 
proved when the Ss found that they would be 
aroused commencing at 4:30 a.m. the following 
morning so that they could take down their tepees, 
break camp, and pack to be prepared to start the 
CREWSCAT as early as 6:30 a.m. 

Two courses each 2 miles in length started at 
the test area. Four crews were tested on a course 
each day with crews starting the problem about 
three-quarters of an hour apart so that they would 
not interfere with each other’s activities. Scenario 
personnel remained in their positions and handled 
in the same manner each of the four crews. The 
school provided their most qualified instructors to 
act as umpires. An umpire remained with a crew 
and rated very specific crew behaviors in terms 
of two or three categories. Fortunately, no snow 
fell on the testing days, although the temperature 
occasionally dropped to —20° at night. 

The product-moment correlation was used to de- 
termine the relationship between the scores for the 
various independent variables and performance on 
the CREWSCAT. In testing Hypothesis III on 
attitude discrepancy the additional statistical pro- 
cedure of Study B was followed. Besides deriving 
the discrepancy between leader and men as a 
function of the difference between the leader's 
score and the mean for his men on authoritarianism, 
the discrepancy was also determined through 
summing the absolute difference between each 
man's score and the score of the leader. 
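
The two discrepancy indices can be compared in a short sketch. The function names and A-E scores are hypothetical; the absolute value in the mean-based index is an assumption, while the summed index uses absolute differences as stated in the text.

```python
from statistics import mean


def mean_based_discrepancy(leader_score, men_scores):
    """Difference between the leader's score and the mean of his men.

    Assumption: the size, not the direction, of the difference is scored.
    """
    return abs(leader_score - mean(men_scores))


def summed_discrepancy(leader_score, men_scores):
    """Sum of absolute differences between each man and the leader."""
    return sum(abs(leader_score - score) for score in men_scores)


# Hypothetical six-man crew: an AC scoring 24 and five crew members.
crew_men = [26, 22, 30, 24, 27]
example_mean_based = mean_based_discrepancy(24, crew_men)
example_summed = summed_discrepancy(24, crew_men)
```

The summed index registers disagreement even when men who score above and below the leader cancel out in the mean, so the two indices need not rank crews identically.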

, In Study B the correlation between the ADA 
creased as a function of squad "integrality" or 
time that the men had been together in the same 
group. The 60 crews in Study D were separated 
into integral and nonintegral crews. The 21 crews 
in the integral group had three or more members
in the same crew back at the home base. However,
members in some crews in the integral group had
known each other for an average of only one
month. Nevertheless, this criterion of integrality
appeared to be the best one with the available data. 


Results 


Problem solving and role discrepancy.
Summated crew scores for the AC as a problem
solver correlated -.15 with crew performance.
The correlation was .09 between
scores on the RD variable and CREWSCAT
performance. Both correlations were nonsignificant
and in the direction opposite to
that hypothesized.

Leader and men comparison on authoritarianism.
The mean A-E score for the 60
ACs in Study D was 24.1; for the 300 
airmen it was 26.1. Leaders were significantly
less authoritarian than their men (t =
2.91, p < .01). A nonsignificant correlation 
of —.01 was obtained for the 60 groups be- 
tween an AC's A-E score and the mean A-E 
score of his men. In Study D, 57 ACs
completed the Shipley intelligence test 
(Shipley, 1940) as well as the A-E scale. 
However, the intelligence test was adminis- 
tered as a power test rather than a timed 
test. The correlation between the A-E scores 
and the subtest of reasoning was .05; for 
the vocabulary subtest —.07; and for the 
composite .01. If the reliability and validity 
of the two measures employed are granted, 
these nonsignificant correlations are at vari- 
ance with other findings (Titus & Hollander, 
1957). Perhaps, in some military officer 
populations there may be a slight tendency 
for some of the more intelligent to be authoritarian
as compared with a general population
tendency for authoritarianism and
intelligence to be negatively related. 


Authoritarianism and crew effectiveness. 
No statistically significant relationship was 
found between either the leader’s A-E score 
or the crew’s mean A-E score and crew per- 
formance. The correlations were .14 and 
.03, respectively. Moreover, the correlation 
between crew heterogeneity on authoritarian- 
ism and performance was —.07 and non- 
significant. Group heterogeneity scores on 
authoritarianism and ADA scores had a sig- 
nificant correlation of .37 (p < .01, two- 
tailed test). 

Leader and men attitude discrepancy and 
crew effectiveness. The sum of the absolute 
difference between each crew member’s A-E 
score and that of his AC correlated —.22 
with crew performance, that is, ineffective 
performance was associated with crew members’
dissimilarity with the AC on authoritarianism.
A correlation of -.27 was obtained
for the difference between the mean
crew member A-E score and the leader’s 
score and CREWSCAT performance. Both 
of these correlations were significant (p <
TABLE 4

PRODUCT-MOMENT CORRELATIONS FOR VARIOUS
ASPECTS OF AUTHORITARIANISM AND CREW
EFFECTIVENESS AS A FUNCTION OF THE
INTEGRALITY OF GROUPS*

Authoritarian             All         Nonintegral   Integral
Aspect                    Groups      Groups        Groups
                          (N = 60)    (N = 39)      (N = 21)

Crew Heterogeneity        -.07        -.03          -.15
Leader-Men Attitude
  Discrepancy             -.27**      -.26          -.31

* Scores on the A-E scale used as measures of authoritarianism.
** p < .05, one-tailed test.


.05, one-tailed test). For those 23 crews
where mean crew member A-E scores were 
lower than the AC's the correlation between 
ADA score, based on the difference between
mean crew score and leader's score, and 
CREWSCAT scores was —.22 (p > .05). 
Where the 36 mean crew member scores 
were higher than the AC’s the correlation 
was —.33 and significant (p < .05, one- 
tailed test). For one crew there was no 
difference between the AC’s A-E score and 
the mean score for his men. 
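
The source does not state how the significance of these product-moment correlations was assessed. A minimal sketch, assuming the standard t transformation of r with N - 2 degrees of freedom, is given here for the two correlations reported for all 60 crews. The function name is illustrative only.

```python
import math

def t_from_r(r, n):
    """t statistic for testing a product-moment correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Correlations reported for the 60 crews: summed absolute differences and mean-based ADA.
for r in (-0.22, -0.27):
    print(f"r = {r:+.2f}, N = 60: t = {t_from_r(r, 60):.2f} on 58 df")
```

Under this assumption both values exceed in magnitude the one-tailed .05 critical value of about 1.67 for 58 df, which agrees with the significance levels reported in the text.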


Correlations as a function of crew in- 
tegrality. Integral groups, as defined earlier, 
are those school crews where three or more 
members belonged to the same crew at the 
home base. Table 4 presents data on the 
effect of integrality on the size of the corre- 
lations for crew heterogeneity and ADA and 
CREWSCAT performance. The results are 
generally inconclusive but they do indicate 
that the ADA correlation cannot be attributed
to a correlation between crew heterogeneity
on authoritarianism and field performance.


4 Of the 76 crews tested 16 were excluded in the
original study (Greer et al., 1957a, 1957b, 1957c)
from further analysis because of unreliable criterion
scores. Nevertheless, analyses based on
all 76 crews now indicated a correlation of -.32
between leader-men ADA and CREWSCAT
scores when mean A-E scores were used and -.28
when the sum of A-E score differences with the
leader was employed. Both correlations were significant
(p < .05, one-tailed test).

The midscoring leader and members on 
authoritarianism and crew effectiveness. A 
curvilinear relationship between a leader’s 
authoritarianism and group performance was 
suggested by analyses in Study B. The com- 
parable eta for Study D was .66 with a t 
value of 9.43, and the test for difference 
from linearity produced a significant chi 
square (χ² = 36.75; 9 df, p < .01). This
finding confirmed what Table 5 suggests. 
In Study D the mean performance score for 
crews of the 18 midscoring leaders on au- 
thoritarianism was 64.3. For the 42 extreme 
scoring leaders, 21 on each side of the mid- 
scorers, the mean CREWSCAT score was 
59.8. A t value of 1.73 was obtained (p <
.05, one-tailed test). 

For the 21 ACs scoring from 11 to 22 on 
the A-E scale the ADA had a significant rho 
correlation with crew performance of —.39 
(p < .05, one-tailed test). A significant 
rho correlation of —.48 (p < .05, one-tailed 
test) was found for the 18 ACs scoring 
from 23 to 25 on the A-E scale. On the 
other hand, the 21 leaders scoring from 26 
to 37 yielded a nonsignificant correlation be- 
tween ADA and group performance of —.12. 
The possibility that the two significant cor- 
relations are statistical artifacts was not 
rigidly controlled. Additional analysis for 


TABLE 5

MEAN CREW PERFORMANCE AND MEAN ATTITUDE
DISCREPANCY ON AUTHORITARIANISM AS A FUNCTION
OF LEADER A-E SCORE FOR 60 CREWS IN
STUDY D

Leader               Mean Crew      Mean Attitude
A-E Score     N      Performance    Discrepancy

11-20         10     52.4           9.1
21-22         11     65.6           4.3
23-24         12     64.7           2.4
25             6     63.7           3.3
26-27          8     58.8           1.
28-37         13     61.2           5.2

Mean 24.1            61.2           4.6

one of the correlations eliminated this
criticism. The 18 ACs scoring between 23
and 25 on the A-E scale were assumed for 
the purpose of the present analysis to have 
all scored 24. Leader-men attitude dis- 
crepancy for these 18 crews was then, 
the difference between the assumed leader 
score of 24 for each and the actual mean 
A-E score for the crew members. The 
new ADA scores yielded a rho correlation 
of —.52 (p < .05, one-tailed test) with 
CREWSCAT scores; this correlation can- 
not be attributed to variation in leader A-E 
scores. 
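
The control analysis above can be sketched as follows: every leader in the band is assigned the same assumed score, the discrepancy is recomputed from the crew means, and a rank-difference (rho) correlation with performance is taken. The crew means and performance scores below are hypothetical, the helper names are illustrative, and the use of the absolute difference is an assumption.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman_rho(x, y):
    """Rank-difference correlation computed as the product-moment r of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

ASSUMED_LEADER_SCORE = 24                           # value assigned to all leaders in the band
crew_means = [22.5, 27.0, 24.5, 29.0, 25.5, 20.0]   # hypothetical mean A-E scores of the men
performance = [66, 58, 70, 50, 63, 57]              # hypothetical CREWSCAT scores

# With the leader score fixed, the discrepancy varies only with the crew means.
ada = [abs(m - ASSUMED_LEADER_SCORE) for m in crew_means]
print(round(spearman_rho(ada, performance), 2))
```

Because the leader score is a constant here, any correlation that remains reflects variation among the men's means and cannot come from variation in leader scores.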

Results in Table 6 again show that as the 
group mean on authoritarianism becomes 
more extreme, group performance improves. 
The 18 midscoring crews on mean group 
A-E score had a mean CREWSCAT score
of 56.8. A mean CREWSCAT score of 63.0 
was obtained for the 42 extreme scoring 
crews on mean group A-E scores. The 
difference between the mean CREWSCAT 
scores was significant (t = 2.26; 58 df, p
< .05, two-tailed test). But this time mean
crew heterogeneity on authoritarianism 
showed no evidence of decreasing as mean 


TABLE 6

MEAN CREW PERFORMANCE, MEAN CREW HETEROGENEITY,
AND MEAN ATTITUDE DISCREPANCY AS
A FUNCTION OF MEAN MEN A-E SCORE FOR THE
60 CREWS IN STUDY D

Mean Men                Mean Crew      Mean Crew        Mean Attitude
A-E Scale Score   N     Performance    Heterogeneity*   Discrepancy

20.0-21.9          2    65.5           35.5             3.8
22.0-22.9          4    66.8           23.0             3.5
23.0-23.9         10    62.7           20.9             3.2
24.0-24.9          6    58.3           20.7             5.4
25.0-25.9          7    57.7           18.6             2.6
26.0-26.9          5    57.0           22.4             4.5
27.0-27.9          9    59.4           26.0             4.8
28.0-28.9         10    62.0           23.7             5.1
29.0-29.9          3    66.0           20.3             7.1
30.0-32.9          4    63.3           24.3             6.2

Mean 25.9               61.2           22.8             4.6

* Sum of differences between each man's A-E score and the
mean A-E score for all men in a crew.


group scores became more extreme. More- 
over, crew heterogeneity on authoritarianism 
failed to relate significantly with group per- 
formance in Study D as it had in Study B. 

Crews may have differed in conscien- 
tiousness during the research. Consequently, 
a lack of involvement might be reflected in 
the manner of completing questionnaires and 
in performance on the CREWSCAT. The 
men in noninvolved crews might be expected 
to complete the A-E scale in a random 
fashion. This reaction would increase the 
likelihood that such a crew would have a 
mean A-E score falling toward the middle 
of the continuum. Crews with means to- 
ward the extreme might more likely repre- 
sent sincere and conscientious effort in re- 
sponding to the questionnaire and these 
men might be expected to perform to ca- 
pacity on the CREWSCAT. Regardless of 
the value of this hypothesis, crew hetero- 
geneity on authoritarianism did not in Study 
D prove a sufficient basis for accounting for 
the better performance of crews whose au- 
thoritarian means fell toward the extremes. 

Four rank difference correlations with 
mean crew A-E scores artificially held con- 
stant were obtained between ADA and crew 
performance. The 20 crews with means 
between 22.0 and 24.9 were for this analysis 
all considered to have a mean of 23.0. A 
significant correlation of —.52 (p < .05, 
one-tailed test; rho: = .53) was obtained 
between the new ADA scores for these 20 
crews and performance. For the 18 crews 
whose A-E means fell between 24.0 and 
26.9, and 25.0 was taken as the mean for 
all, a significant ADA correlation of —.61 
(p < .01, one-tailed test) was found. However,
for the 21 crews where the means were
distributed from 25.0 to 27.9, and 26.0 
was taken as the mean for all, there was a 
nonsignificant correlation of .10. Again, for 
the 19 crews whose A-E means ranged from 
27.0 to 28.9, and the mean for all was taken 
as 28.0, a nonsignificant correlation of .11 
was obtained between ADA and perform- 
ance. Nevertheless, the data demonstrated 
that the ADA correlation could not be 
attributed to an artifactual relationship with 
mean group authoritarianism. 
Leader nonconforming aggressiveness, attitude
discrepancy, and crew effectiveness.
The suggestion in Study B that the effect 
of the ADA variable on group performance 
can be influenced by other characteristics of 
the leader prompted further analysis of the 
data in Study D. One of the measures in 
Torrance’s Life Experience Inventory ad- 
ministered to Ss at Stead was the Noncon- 
forming Aggressiveness Index (NAI) 
(Torrance & Ziller, 1957). In responding to 
this index the S indicated his past socially
nonsanctioned behaviors by responding to 
questions on such matters as playing hookey, 
cheating, opposing teachers and parents, and 
smoking under 12 years. It could be inferred 
that leaders high in nonconforming aggres- 
siveness, regardless of their similarity on 
authoritarianism with their men, would not 
be as involved in the performance of their 
crews on the CREWSCAT as other leaders. 
Consequently, for high scoring leaders on 
nonconforming aggressiveness the effect of 
the ADA variable might be masked. 

The average score for the 60 ACs on non- 
conforming aggressiveness was 7.2. The 
product-moment correlation between AC’s 
scores on nonconforming aggressiveness and 
crew performance was —.08 and nonsignifi- 
cant. Twenty-one ACs scored between 1 
and 5; 21 scored between 6 and 8; and 18 
leaders scored from 9 to 15 on the NAI. 
The ADA rho correlation of —.41 with 
crew performance was significant for the 
group of 21 low scoring ACs on the NAI 
(p < .05, one-tailed test). For the mid- 
scoring group a significant rho correlation 
of -.38 was obtained (p < .05, one-tailed
test). A significant product-moment correla- 
tion, then, of —.46 (p < .01, one-tailed test) 
was obtained between ADA and crew per- 
formance for the 42 ACs who scored from 
1 to 8 or low and moderate on the NAI. 
However, for that group of 18 ACs scoring 
the highest in nonconforming aggressive- 
ness there was a nonsignificant rho correla- 
tion of —.12 between ADA scores and 
CREWSCAT scores. 


Leader-men exchange of sociometric re- 
sponses and crew effectiveness. Socio- 
metric responses given by the airmen to 
their AC and the relationship of these re- 


sponses to crew effectiveness can be ex- 
amined in view of the inconclusive results in 
Study D for the PS and RD variables. 
Although sociometric data were also
collected in the other studies, they were not 
used as a vehicle to test the hypotheses of 
this research since it seemed difficult at that 
time to infer accurately the determinants of 
the responses and their relevance here to the 
theory on the reciprocality of indulgence. 
However, there were several reasons for be- 
lieving that responses to the PS and RD 
measures in Study D were similar to the 
sociometric responses given by the men to 
their leaders. 

In Study D all crew members spent 4 or 5 
stressful wintry days in the High Sierra 
before testing. For an AC to get his crew 
to follow school rules and reach school ob- 
jectives he had to be a taskmaster. Conse- 
quently, many men on responding to the 
PS and RD measures may have simply rated 
those ACs favorably who were not taskmasters.
This tendency may have been compounded
by the fact that some members
of almost all crews reported to the investi- 
gator that they had little, if any, basis in 
experience to respond to the PS and RD 
items. Therefore, it seems quite possible 
that many responses to these items may have 
been on the same order as sociometric re- 
sponses. 

Three sociometric-type items were responded
to by all Ss in Study D at the same
session in the field when they answered 
the PS and RD questions. The nature of 
the sociometric questions employed was 
based partially on the technique developed 
by Gardner and Thompson (1956). In the 
present situation the sociometric items were 
self-administered with a person placing his 
responses on a scale “one” to “nine.” The 
sum of crew member responses on the sociometric
item on friendship for the 60 ACs
correlated -.14 with CREWSCAT scores.
For the "generally helpful on the course"
item the correlation was -.19. The sum of
the degrees of nomination given to the
leader by his men on “usefulness as aircraft
commander" correlated -.17 with
crew performance. Although none of the
correlations was significant, there was a consistent
negative relationship between sociometric
value for the AC and crew performance
on the CREWSCAT.

The sociometric items used in Study B 
were derived through pretests and discus- 
sions with Korean returnees concerning im- 
portant aspects of interpersonal relation- 
ships. Each S in Study B indicated choice, 
rejection, or indifference to all other squad 
men in his squad for the 10 items. The
sociometric battery can be found in “Small 
Unit Effectiveness” (Havron, Fay, & Mc- 
Grath, 1952). Several of the items are: 


"If you were told to pick the men whom you 
wanted to live with in a tent or barracks, what 
man (or men) would you choose and what man 
(or men) would you not choose?" 


"If you had to pull four hours guard duty at an 
ammo dump far from camp, at night, what man 
would you want to be with you, and what man (or 
men) would you not want to be with you?" 

The sociometric questionnaire was ad- 
ministered to all squad men in Study B 
shortly before the RSFP. Sociometric
scores were based on the summation of
choices or rejections to each of the 10 items
given by all squad men for their SL. The
sociometric results were based on data for
99 squads; sociometric data for one squad
was misplaced. A correlation of .08 was obtained
between the total number of choices
an SL received and RSFP scores. For the
number of rejections an SL received and
squad performance the correlation was .11.
The correlation with squad performance was
.01 when the total number of rejections given
to the SL within his squad was subtracted
from the total number of choices he received
from the same people. All product-moment
correlations were nonsignificant. However,
when choices and rejections were added the
correlation between the sociometric responses
received by the SL and RSFP
scores was .19 (p < .05, one-tailed test).
With the use of the biserial correlation to
compensate for the restricted range, the
relationship was determined between the
number of choices an SL received for each
of the 10 sociometric items and squad performance
(Greer, 1955). A correlation of
.36 (p < .01, one-tailed test) was found for
the item concerning whether the SL would
be chosen to cover the respondent if the
squad man led an advance through an enemy
town. Correlations for the other nine items
were nonsignificant.

Since the simple number of sociometric 
responses, regardless of whether they were 
choices or rejections, was significantly re- 
lated to squad effectiveness, the mean num- 
ber of sociometric responses given to the 
SLs for the 13 effective and 13 ineffective 
squads of Study A were determined and 
compared. The effective group had a mean 
number of 67.1 sociometric responses given 
to the SL in each squad as compared with 
64.4 for the ineffective squads. The differ- 
ence was nonsignificant (t = .75). More- 
over, neither the mean squad number of 
choices nor rejections for the SL differed 
significantly between effective and ineffective 
groups. 

In Study C (Havron et al., 1954) nine 
sociometric items either identical or similar 
to those of Study B were administered 
in the same manner to men in the 112 
squads. The number of sociometric re- 
sponses received by the SL, regardless of 
whether they were choices or rejections, 
correlated .03 with squad performance and 
was nonsignificant. 

Data in the various studies suggest that 
there was some tendency for the leader to 
receive sociometric choices when he placed 
on his men a minimum of task demands. 
Analysis of the relationship between the 
sociometric responses a leader gives to his 
men and group effectiveness may indicate 
another basis for the sociometric response. 
For Ss in Study B a correlation of .21 was 
found for the total number of choices an 
SL gave to his men and squad performance. 
For rejections given by the SL to his men 
the correlation was —.22. When rejections 
were subtracted from choices given by an 
SL, the correlation with squad performance 
was .26. The slightly higher correlation in 
this last case may be attributed to the forced 
correlation which exists between choices and 
rejections. The three correlations were each 
significant (p < .05). However, when an
SL's choices and rejections for his men were
simply added, the total number yielded a
nonsignificant correlation of .05 with squad 
performance. For Ss in Study C the number
of positive sociometric choices given by
the leader correlated .10 with squad per- 
formance and was nonsignificant. 

In Study D the AC's response for his men 
on the sociometric friendship item at Stead 
correlated —.09 with crew performance 
(N = 57). For the friendship item in the 
field the correlation was .06 (N = 60).
The AC’s sociometric responses for his men 
on the helpfulness item correlated .11 with 
the CREWSCAT (N = 59), and .17 for 
the “usefulness as aircraft commander” item 
(N = 60). In several cases the AC had 
failed to complete all the items. Although 
none of the correlations was significant be- 
tween the AC’s sociometric responses for 
his men and crew performance, the change 
in direction and increasing size of the cor- 
relations suggested that as the AC came to 
know his men better, his sociometric re- 
sponses correlated more positively with crew 
performance, and that the correlations in- 
creased in magnitude as the sociometric cri- 
terion became more directly related to tech- 
nical competence. Moreover, the data may 
be interpreted to suggest that the ACs in 
Study D were not as personally involved in 
the performance of their men as the SLs in 
Study B where the correlations were larger 
and significant. Generally, then, the results 
suggest that one basis for a leader’s socio- 
metric responses for his men is their com- 
petence and/or willingness to “put out” for 
him. 

Further analyses of problem solving and
role discrepancy data in Study D. Crew
scores on the RD measure correlated —.57 
with the sociometric responses the crew gave 
their leaders in the field as friend. This cor- 
relation suggested for Study D a marked 
relationship between liking an AC and per- 
ceiving his behavior as matching that which 
is desired in an ideal leader. This finding 
suggests that sociometric feelings toward the 
AC produced a halo effect and influenced 
their responses to PS and RD questions. 

When the RD scores for the 21 integral 
crews were related to performance, the cor- 
relation was .00 as compared with .09 for 
all 60 crews. The correlation between RD 
scores and crew performance with friend- 
ship partialed out or held constant was —.10 
and nonsignificant. Nevertheless, in taking
into account integrality and partialing out 
the friendship component the correlation be- 
came more consistent with expectation. 
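
For reference, the standard first-order partial correlation formula can be applied to the zero-order correlations reported in this section. The source does not say which zero-order values entered the reported partial, so the inputs below are an assumption: the RD-friendship correlation of -.57 and the friendship-performance correlation of -.14 for all 60 crews, combined in turn with each of the two reported RD-performance correlations. The function name is illustrative.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

r_rd_friend = -0.57    # RD scores with friendship responses (reported, 60 crews)
r_friend_perf = -0.14  # friendship responses with CREWSCAT scores (reported, 60 crews)

# RD-performance correlation for all 60 crews (.09) and for the 21 integral crews (.00).
print(round(partial_r(0.09, r_rd_friend, r_friend_perf), 2))
print(round(partial_r(0.00, r_rd_friend, r_friend_perf), 2))
```

Under these assumed inputs the formula gives about .01 when the 60-crew value of .09 is used and about -.10 when the integral-crew value of .00 is used. The second figure matches the reported partial, which fits the statement that both integrality and friendship were taken into account, although the exact computation is not given in the text.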

In Study A there was some indication of a 
greater variability in mean PS scores for the 
13 ineffective squads as compared with the 
13 effective squads. This finding suggested 
that among groups where the leaders receive 
high mean PS scores there was a greater 
variability in criterion performance scores 
than among groups where the leaders receive 
a low mean PS score. The crews in Study 
D were rank ordered in terms of the degree 
to which the AC was perceived as a problem 
solver. Twenty-two crews were designated 
the “high problem solver group”; and 21 
crews the “low problem solver group.” The 
mean score for the high problem solver 
group on the CREWSCAT was 58.6, while 
the mean score for the low problem solver 
group was 63.9. The standard deviation for 
the high problem solver group was 11.9 and 
7.6 for the low problem solver group. A 
significant F value of 2.43 was obtained for 
the difference between the two variances (df 
= 21 and 20, p < .05). These data support 
the possibility of two general explanations 
for AC’s receiving high PS scores. In the 
one case, high PS scores may have reflected 
the leader’s past indulgences to his men 
which led to effective performance. On the 
other hand, the second explanation, which 
has to account for much of the final result 
in Study D since there was an overall nega- 
tive correlation between PS scores and crew 
performance, is that some men rated a leader 
who was not a taskmaster high as a problem 
solver. As a consequence, in such cases, 
there may have been a lower criterion score. 
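
The variance comparison reported above can be checked from the two standard deviations. This is a minimal sketch that assumes the F ratio was formed as the larger sample variance over the smaller; the function name is illustrative.

```python
def variance_ratio(sd_larger, sd_smaller):
    """F ratio formed by dividing the larger sample variance by the smaller."""
    return sd_larger ** 2 / sd_smaller ** 2

# Reported standard deviations: 11.9 (high problem solver group, 22 crews)
# and 7.6 (low problem solver group, 21 crews).
f_value = variance_ratio(11.9, 7.6)
print(round(f_value, 2))
```

The ratio computed from the rounded standard deviations comes to roughly 2.45 on 21 and 20 df, close to the reported 2.43; the small gap presumably reflects rounding of the published standard deviations.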

Intercorrelations among independent vari- 
ables. Again, as in Study D, the ADA 
variable correlated nonsignificantly with PS 
and RD; the correlations for the 60 cases in
Study D were, respectively, .19 and .07.
Crew PS and RD scores yielded a significant
correlation of -.33 (p < .05), that
is, the more a group perceived the leader as
a problem solver the less he was perceived
as deviating from role expectations.

In Table 7 are presented the intercorrela- 
tions among the three independent variables 
and various sociometric responses given to
the leader in Studies B and D. 

In each study there was a significant posi- 
tive correlation between sociometric choices 
and PS scores and a significant negative 
correlation with RD scores. Conversely, 
sociometric rejections correlated positively 
with RD scores and negatively with PS 
scores. Scores on the ADA variable did not
correlate significantly with the sociometric 
responses in either Study B or D. These 
results further suggest that ADA involves 
a rather independent type of indulgence 
from PS and RD. Consequently, the data 
indicated that it was not unreasonable to 
find a significant correlation between ADA 
and crew performance in Study D and no 
significant correlations with the criterion for 
the other independent variables. 


TABLE 7

CORRELATIONS BETWEEN INDEPENDENT VARIABLES
AND SOCIOMETRIC RESPONSES RECEIVED
BY LEADERS

                                    Sociometric Responses
Independent
Variables                 Choices      Rejections    Degree of Choice(a)

Problem Solving           .45(b)*      -.44(b)*      .31(a)*
Role Discrepancy          -.58(b)**    .57(b)**      -.57(a)**
Attitude Discrepancy on
  Authoritarianism        -.06(c)      .11(c)        .08(a)

(a) Product-moment correlation based on 60 groups and measures
in Study D. Sociometric item on friend in field used.
(b) Rho correlation based on 26 groups in Study A and measures
in Studies A and B. Responses to all sociometric items in
Study B used.
(c) Product-moment correlation based on 99 groups and measures
in Study B. Responses to all sociometric items in Study B
used.
* p < .05, two-tailed test.
** p < .01, two-tailed test.


Discussion


Three hypotheses were derived from the
theory of the reciprocality of indulgences
and tested in four studies. The hypotheses
generally state that the more a leader indulges
his men, the more effectively they
perform on a field problem. The PS and 
RD hypotheses were tested in Studies A, 
C, and D. The hypothesis on the leader- 
men ADA was tested in Studies B and D. 
The PS and RD hypotheses received sta- 
tistical support in Studies A and C, but not 
in Study D. The hypothesis on ADA was 
confirmed in both Studies B and D, using
different measures of authoritarianism. The
failure to receive statistical confirmation for 
two of the three hypotheses in Study D can 
be examined in the light of conditions 
present in that study. One of the pitfalls 
earlier enumerated may have been respon- 
sible, that is, the lack of relevant experience 
leader and men had had with one another 
prior to the CREWSCAT field problem. 


Problem Solving and Role Discrepancy 


Scores on PS and RD measures were ob- 
tained from Ss both before and after ad- 
ministration of field problems; they were 
obtained as part of an interview and as part 
of a questionnaire; and different scoring 
procedures were employed. Statistical sup- 
port for the hypotheses was obtained in 
Studies A and C, regardless of the particular 
method used. The amount of indulgence 
men had received from their leader was in- 
ferred from their perception and statement 
of their leader's behaviors. The intercorre- 
lation between group PS and RD scores was 
marked in Study A, but smaller in Study D, 
although still significant. PS items repre- 
sented universal needs for these populations. 
However, in measuring RD the data re- 
vealed in Study A that Ss differed in role 
behaviors desired of SLs. Also, Ss per- 
ceived considerable variability in the be- 
haviors of different squad leaders. On the 
other hand, members of effective squads did 
not differ from members of ineffective 
squads in the desired behavior or in the 
perception of leader behavior. 

The findings in Studies A and C for PS 
and RD and group performance were not 
replicated in Study D. Indeed, there was 
a tendency for those ACs perceived as prob- 
lem solvers and meeting role expectations 
to have crews that performed more poorly 
than others. The failure to obtain significant
results in the predicted direction may be 
attributed to two major reasons: the lack of 
a legitimate basis for most crew members 
to respond to the items on PS and RD, and 
the motivational context within which the 
research was conducted. 

First, in the Air Force study most Ss 
had never worked together before entering 
the survival course. Moreover, because of 
the need to separate some 12-man school 
crews into two 6-man crews for purposes of 
testing on the CREWSCAT, 10 of the 60 
crews tested were led by leaders who were 
not even ACs of their crews during the 
school's course. Even for those test crews 
with the same AC as the school had desig- 
nated, many men found the items asked 
specific questions on the leader-follower re- 
lationship about which they had had little, 
if any, experience. Unfortunately, the in- 
vestigator encouraged the men to respond 
to the items on the assumption that where 
there was a lack of knowledge the responses 
would be randomly distributed. 

Second, during the period of this study 
Air Force personnel had a generally anti- 
pathetic attitude toward the survival course 
(fully understood by the investigator who 
had completed the survival course that previ- 
ous summer). Personnel resented anyone 
who tried to get them to "put out"; the 
course was demanding and arduous. In ad- 
dition, the leaders in the Air Force study 
as compared with the leaders in the Army 
studies did not have their group's perform- 
ance reflect as directly on them. The ACs 
were largely responsible for the conduct of 
their crew members. However, a leader in 
a situation such as this cannot escape in- 
volvement in his group's performance. It is 
simply suggested that effective performance 
by crew members on the CREWSCAT did 
not provide the degree of indulgence for 
their leaders as the comparable situation in 
the Army studies. 

Under these circumstances it would not 
seem surprising to find a tendency for men 
to accept sociometrically leaders who placed 
fewer task demands on them. Indeed, there 
was a consistent tendency for the sociometric 
response for the AC to be negatively correlated
with CREWSCAT performance. It
cost the men little to make the leader look 
good through giving him positive ratings 
on a paper-and-pencil questionnaire. Re- 
ciprocality of indulgence could have operated 
in a different manner than explicitly studied 
here. If the men indulged the leader through 
rating him favorably on the sociometric 
measures, then it might follow that the same 
situation would occur when men responded 
to PS and RD items in a situation in which 
they had little, if any, concrete data to use in 
giving an objective response. For this rea- 
son, it may be worthwhile to examine the 
possible determinants of sociometric re- 
sponses in this context. 

The sociometric response can be viewed as 
a prediction by Individual B of the need 
satisfaction he would derive from Individual 
A under a specified circumstance or type of 
situation. B’s prediction of his satisfaction 
with A may be based on relevant past ex- 
periences that B has had with A. In such 
instances, A may have provided indulgences 
for B, and B seeks this relationship again. 
Here, one might expect to have a legitimate 
test of the theory of the reciprocality of in- 
dulgences. On the other hand, B may choose 
A on the basis of perceiving in A some 
characteristic which from previous experi- 
ence with other people leads him to expect 
need satisfaction from A. However, B is not 
indebted to A, hence there may be no exchange
of indulgence from A.

Masling et al. (1955) have contended, on 
the basis of the analysis of these data and 
other data in a cross-validating study with 
a Navy population, that sociometric choices 
given to a person in the structured leader- 
ship position reflect, in part, choices given to 
the status position rather than to the par- 
ticular individual. Such choices do not per- 
mit one to infer a feeling of indebtedness or 
obligation on the part of the respondent to 
the leader. Moreover, the investigator has 
observed that military personnel sometimes 
resent giving sociometric responses and
when pressed to do so tend to give favorable 
responses. 

B may reject A sociometrically because
the latter failed to provide B with an in-
dulgence or actually caused B a deprivation.
However, B's previous knowledge of A's 
performance may lead B to express his 
evaluation of A's competence through a posi- 
tive sociometric expression. Also, then, 
these sociometric responses do not neces- 
sarily represent an indebtedness based on a 
previous indulgence. The assumption can- 
not be made, then, that B's sociometric re- 
sponse necessarily reflects a measure of past 
indulgences or lack of indulgences from A. 
Evidence that the sociometric technique is 
an ambiguous measure of prior indulgence 
has been given in an experiment conducted 
by Smith (1959). In a laboratory setting, 
using Air Force personnel drawn from the 
same population but not the same Ss as 
those in Study D, Smith employed the ex- 
pression by his Ss of the desire to remain 
in a crew as a measure of cohesiveness. He 
found no relationship between cohesiveness 
and the willingness to sacrifice the attain- 
ment of one's own goal for the group goal. 
In Studies B and D marked correlations 
were obtained between scores on the PS and
RD measures and sociometric responses. 
While scores on measures of PS and RD 
related significantly with group performance 
in Studies A and C, but not Study D, socio- 
metric choices for the leader failed generally 
to relate significantly with performance in 
Studies A, B, C, and D. In Study D the 
correlations present were even negative.
Nevertheless, one of the 10 sociometric items
in Study B elicited choices from squad men
for leaders which correlated with statistical
significance with performance. The item
concerned the confidence a squad man would
place in his SL’s covering him if he led an
advance into an enemy town. Perhaps
choices for the leader in this situation re-
flected a rating on an SL's competence.
However, neither the summated choices nor
rejections for all items for the SL in Study
B independently correlated with statistical
significance with criterion RSFP scores.
Studies B, C, and D yielded positive cor- 
relations between the leaders’ sociometric 
choices of their men and group perform- 
ance. Even though these correlations 
reached statistical significance only in Study
- >, the consistency of their direction in the 
three studies warrants attention. The most 


reasonable interpretation seems to be that 
a leader's responses represented largely
ratings on the competence of his men. It 
would appear doubtful that much, if any, of 
the variability in group performance scores 
could be attributed to the leader's increased 
efforts on the field problems as a conse- 
quence of the indulgences his men had given 
him in the past. 

Data in these studies indicated that 
although the reasons for responses to PS 
and RD items and sociometric measures may 
overlap for group members, the former two 
measures did tap, in addition, that type of 
indulgence which differentiates effective 
from ineffective groups in Studies A and C.
However, in Study D where men generally 
lacked the experience to respond to PS and 
RD items, scores followed sociometric eval- 
uations for the AC in tending to correlate 
negatively with CREWSCAT performance. 

Scores on the three sociometric items, PS 
and RD in Study D all correlated negatively 
with performance, albeit without statistical 
significance. The results did suggest some 
tendency for rejection of a leader who de- 
manded effective performance. In Study B 
sociometric rejections for the leader tended 
to correlate negatively with squad perform- 
ance. Consequently, when simply the num- 
ber of sociometric responses, regardless of 
whether they were choices or rejections, re- 
ceived by the SL was correlated with RSFP 
scores a statistically significant positive re- 
lationship was found. 

Havron, Greer, and Galanter (1952) 
tested the hypothesis that there would be a 
positive relationship between the SL as a 
taskmaster and squad performance. It was 
difficult to devise questions in this area that 
the respondent might not perceive and, con- 
sequently, answer in terms of PS items. 
The hypothesis was not confirmed through 
the overall score derived from the five items 
employed. However, the item which bore 
the greatest face validity to what was sup- 
posed to be measured yielded responses 
which differentiated with statistical signifi- 
cance between effective and ineffective 
squads in the direction predicted. More 
members of effective squads said their SL 
kept after the men to make them do their
best. 

A leader may often have to motivate some 
men against their wishes and, hence, they 
may reject him on sociometric-type items. 
As previously mentioned, it may be that 
some men may choose the SL sociometrically 
if he does not "push" them. Then, the 
sociometric-type response can be a measure 
of the lack of past deprivation, and ap- 
parently not indicative, or negatively so, 
of effective performance on a field problem. 

The sociometric, PS, RD, and ADA 
measures can be placed on a continuum in 
that order of value in indicating the type of 
indulgence that related to effective perform- 
ance in the situations in this series of re- 
search studies. Generally, as one goes from 
the sociometric to the ADA measure the 
items become more specific and place greater 
constraints on the S for responding ob- 
jectively and concretely. For the first three 
measures one is, however, ultimately de- 
pendent on the S’s willingness and ability to 
depict accurately the nature of various rela- 
tionships with the leader. In responding to 
the authoritarian measure the S had no 
idea how his scores would be used; the 
nature and amount of indulgence were in- 
ferred. 

In Study D where many men had had 
little, if any, experience with their leaders, 
responses to PS and RD items appeared to 
be given on a basis similar to sociometric 
responses. Although both PS and RD scores 
correlated significantly in Study C with 
squad performance, the former correlation 
was smaller than the latter. In responding 
to RD items, as data in Study A suggested, 
Ss may have been less able to identify the 
more socially acceptable or desirable role be- 
haviors and, hence, responded more directly 
or objectively (Edwards, 1959). 

One could conclude that groups where 
leaders are perceived by their men as prob- 
lem solvers are prone to be both effective and 
ineffective and that groups where the leaders 
are not perceived as problem solvers are apt 
to be more consistently less effective. Wil- 
son, Beam, and Comrey (1953) related the 
characteristics of supervisors to group effec- 
tiveness through determining by ratings 


those workshops which were high, low, and 
in the middle on effectiveness. These inves- 
tigators found that supervisors of both the 
high and low effective groups as compared 
with supervisors of the middle groups had 
attitudes of greater sympathy and helpful- 
ness toward subordinates. Actually, in Study 
A there may have been a greater variability 
in problem solver scores for the SL of in- 
effective squads. Moreover, in Study D a 
statistically significant greater variance in 
criterion scores was found among a group 
of crews high as compared with a group low 
in their problem solver scores for their 
leaders. 


Authoritarian Attitude Discrepancy 


A number of studies have focused on the 
possible differential effect of the use of au- 
thoritarian and democratic principles of 
organization on group performance. This 
interest has stemmed, at least partially, from 
concern with the political structures of large 
societies. Clear-cut, repeatable results from 
these “miniature” studies are still wanting. 
The problem appears quite complex and con- 
clusions may be reached only for specific 
situations. Moreover, to generalize from 
research on small groups to large societies 
seems fraught with difficulties. Societies are 
comprised of various sizes and types of in- 
terrelated groups and the measure of an 
effective society may be debatable. Even in 
the study of the effects of different political 
principles on small group behavior one 
should not overlook the influence of the per- 
sonality of group members and the nature 
of the group task. 

Haythorn et al. (1956a) reported greater
effectiveness in performance for groups
composed of equalitarians or low scorers on
the F Scale (Adorno et al., 1950). However, 
the task required the cooperation of group 
members and dealt with problems in human 
relations. Consequently, the generalizability 
of this finding to task situations of a differ- 
ent nature might be questionable. Haythorn, 
Couch, Haefner, Langham, and Carter 
(1956b) failed to demonstrate that the more 
effective groups were those where the leader 
and men were homogeneous, that is, the 
leader and men were both equalitarian or
the leader and men were both authoritarian. 
Their groups viewed a human relations 
problem on film and then discussed and de- 
termined what they felt should have been 
the proper dialogue. Observers rated the 
homogeneous groups as more cooperative, 
with better communication and higher 
morale than the heterogeneous groups. How- 
ever, the data were equivocal on whether 
homogeneous groups were more effective 
than heterogeneous groups. Leaders may 
have lacked personal concern over group 
performance. In addition, the student leader 
and the other students may not have had 
sufficient previous association to develop a 
basis for the reciprocality of indulgences. 
Perhaps the most important reason for the 
equivocal findings was the need for groups 
to have many kinds of solutions available 
for specific problems. 

Heterogeneity may be more important for 
effective group performance when the group 
task requires the consideration of a variety 
of possible solutions. Hoffman (1959) re- 
ported that he found that groups with mem- 
bers having different personalities performed 
more effectively. Such groups had more pos- 
sible solutions available, increasing the like- 
lihood of the eventual employment of the 
best solution. However, the heterogeneous 
groups were more likely to have problems of
affect. The results of the Hoffman (1959)
study, then, lend some insight into a possible
reason to explain why results of the Hay-
thorn et al. (1956b) study were not con-
sistent with the present findings. 

Groups in this series of research studies 
were faced with few truly problem solving
situations. Group performance evaluations
were based largely on group members' re-
sponses to situations for which they knew
well what was appropriate from previous
training experiences. Most points on the
field problems, and especially the Army
RSFPs, simply depended on the willingness
of men to expend their energies.

The studies presented here involved 
groups whose social patterns were ideally 
authoritarian in nature because of the re-
quirements of the larger groups of which
they were a part. It is well to note in passing


that the results between personality charac- 
teristics and group performance were ob- 
tained with groups whose social systems 
were largely authoritarian. In some groups, 
then, the external values were at variance 
with the personality values of the leader 
and men. Blau (1960) has contended that 
"structural effects" on behavior can be 
separated from the effects of personality on 
interpersonal relationships; the present re- 
search would appear to support this posi- 
tion. 

Studies B and D offered no data to con- 
clude that either the authoritarianism of the 
leader or men related directly to perform-
ance on field problems. Moreover, the
leader’s authoritarianism did not correlate 
with that of his men. The heterogeneity on 
authoritarianism of all men in a group in 
Study B did correlate negatively with per- 
formance; although the correlation was in 
the same direction in Study D, it was not 
significant. For both Studies B and D the 
groups were separated into thirds according 
to total group mean on authoritarian scores.
Six individual correlations were determined 
between group heterogeneity and perform- 
ance. No consistent pattern of correlations 
occurred with either study or between 
studies. F Scale scores were reported in 
the literature to be negatively correlated 
with intelligence (Titus & Hollander, 
1957). Consequently, there would have 
been some basis for expecting the authori- 
tarianism of the leader or men to be 
negatively correlated with performance. For 
the groups in Study A, Havron, Greer, and 
Galanter (1952) reported SLs of effective 
squads were more intelligent than leaders 
of ineffective squads; there was no differ- 
ence between men of the effective and in- 
effective squads on intelligence. In Study 
D there was no relationship between a meas- 
ure of intelligence and the authoritarianism 
of leaders. Moreover, a negative correla- 
tion approached significance between scores 
on a measure of a leader's intelligence and 
CREWSCAT scores (Greer et al., 1957a, 
1957c). Incidentally, the umpire's rating 
of the cooperation (independent from 
CREWSCAT score) within the group dur- 
ing the field problem yielded a significant 
negative correlation with scores on the
measure of the leader's intelligence, that is, 
the more intelligent the AC, the less judged 
cooperation in the crew. 
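
The split into thirds described in this paragraph for Studies B and D can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names, the tuple layout, and every number below are hypothetical, and the source does not say how group heterogeneity was scored, so heterogeneity is simply taken as a given value for each group.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlations_by_thirds(groups):
    """Sort groups by mean authoritarian score, cut them into thirds, and
    correlate heterogeneity with performance inside each third.

    Each group is a (mean_authoritarianism, heterogeneity, performance) tuple;
    this layout is an assumption made for the example.
    """
    ordered = sorted(groups, key=lambda g: g[0])
    n = len(ordered)
    cut1, cut2 = n // 3, 2 * n // 3
    thirds = [ordered[:cut1], ordered[cut1:cut2], ordered[cut2:]]
    return [
        pearson_r([g[1] for g in third], [g[2] for g in third])
        for third in thirds
    ]

# Hypothetical groups, invented only to show the mechanics.
example_groups = [
    (40, 5, 80), (42, 8, 70), (44, 11, 60),   # low third
    (50, 4, 65), (52, 6, 75), (54, 8, 85),    # middle third
    (60, 3, 70), (62, 9, 72), (64, 6, 68),    # high third
]
third_correlations = correlations_by_thirds(example_groups)
print(third_correlations)
```

With these invented numbers the three within-third correlations differ in sign and size, which is the kind of outcome the text describes as "no consistent pattern."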

Different instruments were employed to 
measure the authoritarian-equalitarian atti- 
tudinal syndrome in Studies B and D. Both 
indexes omitted, however, items on projec- 
tivity, sex, and superstition. In responding 
to the EI, Ss selected the best statement
representing their position, thus avoiding 
the possible problem of a set to agree with 
an item. Nevertheless, significant results 
were obtained for the theory of the re- 
ciprocality of indulgences for each measure 
of authoritarianism. In completing the 
authoritarian measures the Ss had no way to
tell how the scores would be used; therefore,
these responses did not reflect any biasing
influence as a consequence of such knowl-
edge. Moreover, as a possible measure of
indulgence, authoritarianism was not re- 
stricted to A and B having previously had 
a history of leader-follower relationships; 
authoritarianism relates to a wide range of 
attitudes and interpersonal behaviors. 

The measures of authoritarianism used in 
these studies do not yield a separate score 
on authoritarianism and equalitarianism. 
The lack of an authoritarian response in 
relevant situations suggests that other kinds 
of responses must be made. A number of 
these other responses can be classified as 
equalitarian. Therefore, the question is 
raised of whether one might not simultane- 
ously be to different degrees authoritarian 
and equalitarian. Perhaps they have a low 
positive correlation. Nevertheless, the meas- 
uring procedure confounds these two pos- 
sible variables. Consequently, the ADA 
scores may also represent a confounding of 
two different influences. 

The ADA hypothesis was found tenable 
when tested with rifle squads on a field prob- 
lem and for aircrews going through a sur- 
vival and evasion field problem. Moreover, 
in both Studies B and D it made no differ- 
ence for the tenability of the hypothesis 
whether the leader was more or less au- 
thoritarian than the group. In addition, the 
data indicated that the hypothesis was ten- 
able for all positions on the authoritarian 


continuum and not just the extremes. 
Simple discrepancy was the important mat- 
ter. In these studies the measure of ADA 
was generally the difference between the 
leader’s score and the mean for the men in 
his group. Another possible index of this 
discrepancy was the sum of the differences 
between each man’s score in the group and 
the leader’s score. This latter procedure was 
also used in both Studies B and D and 
negative correlations remained statistically 
significant. Nevertheless, in each instance in 
which the summation of individual differ- 
ences was used a correlation was obtained 
a few points lower than when the dis- 
crepancy was based on the group mean. 
One might expect the summation method 
to yield a more sensitive measure of leader- 
men discrepancy, since the same mean may 
represent different distributions of scores. 
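
The two discrepancy indexes described above can be written out as a minimal sketch. Several things here are assumptions rather than the studies' procedure: the function names are hypothetical, the scores are invented, and absolute differences are used because the text says the direction of the discrepancy made no difference, though the source does not state the exact arithmetic.

```python
from statistics import mean

def ada_mean_index(leader_score, men_scores):
    """Discrepancy between the leader's score and the mean score of his men.

    Assumption: the absolute difference is taken, so direction is ignored.
    """
    return abs(leader_score - mean(men_scores))

def ada_sum_index(leader_score, men_scores):
    """Sum of the absolute differences between the leader and each man."""
    return sum(abs(leader_score - score) for score in men_scores)

# Hypothetical scores: two groups with the same mean but different spread.
leader = 50
tight_group = [48, 50, 52]
spread_group = [30, 50, 70]

print(ada_mean_index(leader, tight_group), ada_mean_index(leader, spread_group))
print(ada_sum_index(leader, tight_group), ada_sum_index(leader, spread_group))
```

In this invented case the mean-based index is 0 for both groups, while the summed index is 4 for the first and 40 for the second, which shows how the same mean may conceal different distributions of scores.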

However, it may be that beyond a cer- 
tain point in the discrepancy between In- 
dividuals A and B in authoritarianism 
there is no different effect in terms of be-
havior. Any such psychologically insignificant
difference would still be reflected in the 
statistics and, hence, adversely influence the 
degree of the correlation. The use of the 
mean score for the men may mask this 
effect and yield a higher correlation. This 
conjecture seems provocative and may de- 
serve further research effort. On the other 
hand, the mean may simply be a more re- 
liable index than the summation of in- 
dividual differences. 

A logical conclusion would be that in the 
absence of knowledge about the authori- 
tarianism present in a group of men one 
increases the chances of effective perform- 
ance by appointing a leader for the group 
who scores in the middle of the authori- 
tarian distribution; on a probability basis 
the attitude discrepancy should be the 
smallest. Adams’ (1954) findings and the 
statistical analyses in Studies B and D indi- 
cated that the midscoring leader on authori- 
tarianism was, indeed, more liable to have 
effective groups than extreme scoring 
leaders. These empirical relationships raised 
the possibility that the ADA correlations 
were actually due to midscoring leaders 
simply being more effective leaders. This 


situation could account for the relationship
between small ADA scores and effective
performance.

An individual who is neither authori-
tarian nor equalitarian may conceivably,
other conditions equal, have available and be
more likely to employ those solutions or be-
haviors which are most appropriate to the
situation. The midscorer may be character-
ized by a greater repertoire of responses and
a greater willingness to use any specific
responses to meet leadership problems.
Courtney, Greer, and Masling (1952) 
found both types of extreme scorers to be 
more consistent or rigid in the expression of 
their attitudes. Moreover, if heterogeneity 
in group composition can lead to effective 
performance in some situations, hetero- 
geneity in personality may also under cer- 
tain circumstances be an asset for individual 
effectiveness. However, the key is probably
in the nature of the criterion of performance
or task. Some problems may be more amen-
able to solution through a uniform type of
approach and others by a mixed approach. 
Problem solving in the field problems was 
of minor significance, since the appropriate 
behaviors were generally fairly well known. 
In addition, the nature of the problem did
not relate directly to authoritarianism or
equalitarianism.

Further analyses of data in Studies B and
D indicated that the discrepancy between the
leader and men on authoritarianism con-
tributed to the empirical relationship, at
least to a major extent, independently of a
direct personality determinant of the leader.
Moreover, it was ascertained that the ADA
correlation with performance could not be
due to an artifactual relationship with
authoritarian scores for the men in a
group. This research has by no means ruled
out the possible importance in some situa-
tions of the authoritarian or equalitarian
personality in directly determining effective
individual and group performance.

Although the artifactual basis of the ADA
correlation with performance was not
demonstrated, there still remains a question
of the actual causal determinant or deter-
minants of the correlation. Facile com-
munication, harmony, and predictability of
each other's behavior may often be asso- 
ciated with effective group performance. 
Under these circumstances, group members 
would be less likely to have their energies 
bled off into intrarelational problems and 
could, consequently, direct more of their 
concern and efforts to the group task. The 
importance of understanding communica- 
tions seems beyond cavil. Moreover, bicker- 
ing and conflicts among group members 
could detract from group performance. 
Havron, Greer, and Galanter (1952) found 
effective performance associated with groups 
whose members perceived themselves as 
sharing the same interests. Cohesiveness and 
field performance were positively correlated 
for squads used in Study B (Greer, 1955). 
Where members had more accurate social 
perceptions of group relevant matters, they 
were found in several studies to perform 
more effectively (Greer, Galanter, & Nord- 
lie, 1954; Greer et al., 1957a; Havron et al., 
1954). Accurate social perceptions could lead 
group members to act more confidently and 
hence more effectively on group tasks. In
one study (Greer et al., 1954) it was stated that,
regardless of the determinants, effective 
SLs tended to have more accurate social 
perceptions than the ineffective leaders and 
the most popular individuals were more 
accurate than the less popular. 

When the leader and men have similar 
attitudes, the amount of internal group con- 
flict may be reduced through their similar 
understanding and structuring of problem 
situations and agreement on solutions. Addi-
tionally, the fact that leader and men hold
similar attitudes may lead them to believe 
subjectively they can more accurately pre- 
dict the behavior of others; and such be- 
havior may actually be more predictable 
than when leader and men differ in attitudes. 
Leader and men should, then, act with 
more confidence in the achievement of group 
tasks. These explanations could theoretically 
just as well account for the empirical rela- 
tionship between ADA and group perform- 
ance as the reciprocality of indulgences. 

Facile communication, harmony, and be- 
havioral predictability as contributors to 
group performance would not appear in this 
context as necessarily part of the indulgence 
rubric. Furthermore, these possible facili-
tators of effective performance do not have
to stem alone from the similarity of atti- 
tudes on authoritarianism. On the other 
hand, the reciprocality of indulgence in- 
volved in the similarity of authoritarianism 
between leader and men stresses, at least, 
in part, the importance of the leader's be- 
havior and the extent to which this behavior 
satisfies the men. However, the leader's 
behavior prior to the problem itself may 
have served also as a basis for the men to 
infer some of the basic personality struc- 
tures or values of the leader. The ability of 
men to identify with the leader and for them 
to feel he identifies with them may be an 
important aspect of the reciprocality of in- 
dulgence as it relates to similarity in au- 
thoritarianism. 

An examination of the nature of the cri- 
terion problems suggested that the impor- 
tance of communication between leader and 
men was quite minimal. Most interaction, 
especially in the Army studies, was in terms 
of gestures and short commands or replies. 
Since the field problem scores reflected few 
real problem solving or decision making sit- 
uations, the harmony in a group would not 
seem to have been particularly critical in 
directly influencing group effectiveness. 
Other data of Studies C (Havron et al., 
1954) and D (Greer et al., 1957a, 1957b) 
provided no evidence that cohesiveness or 
harmony as measured among group mem- 
bers was necessary for effective perform- 
ance. Of the three, the ability of group 
members to predict one another's behavior 
or attitudes most consistently correlated with 
effective performance on the field problems. 
Results for the original research for Studies 
A (Greer et al., 1954), C (Havron et al., 
1954), and D (Greer et al., 1957a, 1957b) 
showed a significant positive relationship 
between the ability of group members to 
predict accurately some aspect of the group's 
behavior and effective performance. The 
behaviors predicted by group members in 
these studies were generally not related di- 
rectly to the behavioral requirements of the 
criterion problems. 

The nature of the criteria appears to 
minimize the probability that the empirical 


relationship stems largely from the opera- 
tion of variables other than the reciprocality 
of indulgences. Nevertheless, the results of 
some relevant statistical analyses can be ex- 
amined. If the ADA correlations were the 
consequence of the facilitation of communi- 
cation between leader and men, harmony 
between leader and men, and/or the ability 
of leader and men to predict each other's 
behaviors, then the ADA correlations would 
seem likely to follow the same pattern as 
correlations obtained between total group 
heterogeneity on authoritarianism and group 
performance. Indeed, although the dyadic 
relationship between the leader and each 
group member may be the most significant of 
the two-person relationships in the group, 
the group heterogeneity correlations on au- 
thoritarianism, reflecting the effects across 
all relationships in the group, might be ex- 
pected not only to follow the same pattern 
as the ADA correlations but might also be 
expected to be even greater in size. In addi- 
tion, it seems reasonable to assume that, 
other conditions constant, ease of communi- 
cation, harmony, and predictability of be- 
havior in groups would naturally increase 
with a group's age. Then, regardless of 
whether the leader and men or total group 
is considered, the importance of similarity 
on authoritarianism to group effectiveness 
could be expected to diminish or be masked 
as the period of group interaction increased. 
On the other hand, the size of the ADA
correlation might be expected to increase
with a group's age, at least for some period 
of time, as there were more and more oppor- 
tunities for the men to receive or not receive 
the relevant indulgences from the leader. 

The significant correlation for group 
heterogeneity on authoritarianism and group 
performance in Study B was not replicated 
in Study D, although the ADA was signifi- 
cant in both studies. The heterogeneity cor-
relations for the total group did not follow
the pattern of the ADA. In Study B group 
heterogeneity on authoritarianism correlated 
significantly with performance for only the 
youngest group of squads; ADA correlated 
significantly with the field problem for only 
the oldest group of squads. Consequently, it
seems questionable to attribute the signi-
ficant correlations obtained to the same
underlying causes. Over time the leader 
is apt to have provided more or relatively 
fewer indulgences to his men and, therefore, 
the larger correlations for the older groups. 
Additionally, if one contends that it takes 
group members time to learn each other's 
attitudes, then the heterogeneity correlations 
should seemingly have been significant and 
markedly larger for the older groups. The 
data indicated that the ADA correlations did 
not derive from a correlation between group 
heterogeneity on authoritarianism and per- 
formance. 

For Ss in Study B the length of time in 
the same squad correlated significantly with 
the criterion (Greer, 1955). However, for 
Ss in Study D there was no statistically sig- 
nificant association between a group's in- 
tegrality and performance (Greer et al., 
1957a, 1957b), suggesting, perhaps, a lack 
of variability in the variable of time together. 
The correlation between authoritarian leader- 
men discrepancy and group performance in- 
creased over time in both Studies B and D 
with the increase most marked in Study B. 
Indeed, for that group of squads in Study 
B where men had been together the longest 
ADA correlated with the criterion -.48.
The size of the ADA correlation, and the 
fact that it was obtained in that group of 
squads where the men had been together
the longest, can be interpreted to indicate
the independent importance of the recipro-
cality of indulgence as a major contributing
factor to the obtained empirical relation-
ship.

The evidence seems to indicate that the
ADA correlations are largely a function of
the reciprocality of indulgences. Indeed,
the nature of the criteria and the data appear
to give little, if any, support to the other
possible explanations for the empirical rela-
tionship. Moreover, the results appeared to
indicate that ADA is an effective deter-
minant of group performance regardless of
whether the leader-men difference on author-
itarianism occurs at either extreme of the
continuum or in the middle. Generally, one
thinks of authoritarian or equalitarian be-
havior. Therefore, the effectiveness of ADA
at the extremes of the authoritarian con-
tinuum seems reasonable. Perhaps ADA’s
effectiveness in the middle of the continuum 
can be attributed simply to a satisfaction or 
dissatisfaction with the least amount of con- 
sistency in authoritarian or equalitarian 
manifestations. On the other hand, it may 
be that there are some qualitatively different 
behaviors for the midscorers just as there are
between authoritarianism and equalitarian- 
ism. Sanford (1950) and Courtney et al. 
(1952) reported on the results of adminis- 
tering both the A-E scale and open-ended 
questions to a number of Ss. The data did 
provide some statistically significant evi- 
dence that there are qualitatively different 
responses for midscorers, authoritarians, 
and equalitarians. 

The failure of the PS and RD measures 
to correlate with CREWSCAT performance 
in Study D points up the apparent differ- 
ences in types of indulgences. These two 
measures were significantly related to per- 
formance in two previous studies, including 
Study A where the ADA measure also 
yielded a significant relationship. However, 
only the attitude discrepancy between the 
leader and men correlated significantly with 
performance in Study D. These contrasting 
results for the different measures of indul- 
gences were not statistically inconsistent. 
The group scores on ADA in Study B did 
not correlate significantly with the group 
scores on PS and RD in Study A. The latter 
two measures did correlate significantly with 
each other. Moreover, whereas sociometric 
measures significantly related to PS and RD 
measures, there was no significant correla- 
tion with scores on the measure of ADA. 
These same significant and nonsignificant 
correlations were found again in Study D, 
even though the measures of the variables 
were not identical. 

Since both the RD and ADA measures 
relate to types of individual behaviors, one 
might have expected a significant relation- 
ship between them, at least for the earlier 
studies. However, the RD measure may 
have included items referring to specific 
behaviors overlapping little, if any, with the 
authoritarian continuum. Additionally, the 
data have indicated possible biasing in- 
fluences on the RD responses and, conse-
quently, scores on the RD measure may
have represented less valid indications of 
the same type of indulgence as scores on the 
ADA measure. 

The abortive use of the PS and RD 
measures in Study D and the success with 
ADA can, perhaps, be explained in terms of 
the differential opportunities for indulgent 
experiences. Both Studies B and D provided 
evidence that the ADA correlations in- 
creased in size with the ages of the groups; 
the phenomenon was, however, more ap- 
parent in Study B than D. Therefore, one 
could surmise that generally the ages of the 
groups in Study D were not sufficient for 
the responses that men gave about their 
leaders on the PS and RD items to reflect 
very accurately the nature of the pertinent 
indulgences, if any, that the men had re- 
ceived. The fact that the men seldom had 
experience enough to make such specific 
statements has been described earlier. In 
Study D the men's responses to the A-E 
Scale should have been as reliable and valid 
as under any other condition; however, their 
answers to the PS and RD items were prob- 
ably not reliable and even less valid. The 
utility or effectiveness of the ADA measure 
depended not only on the validity of the A-E 
scores but, perhaps, with equal importance 
on the kind of experiences that group mem- 
bers had had with their leader on the 
CREWSCAT before and during the admin- 
istration of the field problem. 

Although the leader-follower relationship 
is an important facet of relevancy for the 
authoritarian continuum, this personality 
syndrome also relates clearly to a large 
number of other social behaviors and 
ideological matters. Consequently, indul- 
gences can be exchanged outside of the con- 
text of a leader-follower situation. It will 
be remembered that before the CREW- 
SCAT problem all men had received in- 
tensive training together back at Stead for 
7 days and arrived at the testing area after 
spending 4 or 5 additional days "surviving" 
together in tepee camps in the High Sierra. 
Based on this investigator's observations and 
personal experience as a trainee, this close
and continuous association under difficult
circumstances led to all group members
having considerable interaction and
knowledge of one another. With this background
experience and the interaction of the desig- 
nated leader and men on the CREWSCAT 
the ADA correlated significantly in the pre- 
dicted direction with group performance. 

The ADA correlation for the youngest 
group of squads in Study B was practically 
zero. Unlike the Air Force situation there 
was no certainty that all men in the group of 
youngest Army squads had had any prior 
experience with one another comparable to 
the 11 or 12 days Air Force personnel had 
with one another just prior to their field 
problem. Indeed, there was considerable 
doubt. It was common knowledge that in 
some cases, although the men in a squad on 
the problem came from the same platoon, 
they hardly knew one another. In order to 
fill its quota of squads for the researchers, 
a platoon on occasion would not have enough
regular or integral squads and, then, would
send "put together" squads comprised of 
men throughout the platoon. Moreover, the 
investigator, who participated in the rating 
of 12 or more squads on the RSFP, did not 
observe as much opportunity as was the
case in the Air Force study for the inter- 
action of leader and men and for the leader's 
behaviors to be perceived differently enough 
to indulge or not indulge according to the 
authoritarian continuum. These factors, 
then, may have accounted for the essentially 
zero correlation between ADA and perform- 
ance for the youngest group of squads in 
Study B. 

No evidence has been found to suggest 
that the significant ADA correlations were 
statistical artifacts of some other relation- 
ships. Moreover, the possibility of other 
theoretical explanations for the empirical 
relationships was explored; such
examinations left the theory of the reciprocality of
indulgences to appear the most legitimate
explanation for the ADA correlations.
Finally, the plausibility has been estab- 
lished of the independent operation of 


as a measure of indulgence. These results 
strongly suggest that for work tasks
comparable to those in this series of research
studies the proper small group leader-men 
arrangement on authoritarianism might con- 
siderably increase production or effective- 
ness at no extra cost in total manpower. 
Moreover, in the absence of knowledge about 
the authoritarianism of group members the 
most effective leader, on a probability basis, 
will be the one who is a midscorer on the 
authoritarian continuum. 


Field vs. Laboratory Research 


The group problems in these studies were 
built to measure group competence on mili- 
tary tasks and to later serve training needs. 
Possible predictor variables did not influence 
the nature of these criterion problems. 
Consequently, it was not surprising that the 
correlations relevant to the theory of the 
reciprocality of indulgences were not any 
higher than .48. Under laboratory control 
the relationships might have been quite 
marked. Nevertheless, the theory of the re- 
ciprocality of indulgences appears to provide 
predictive utility even under these field con- 
ditions. 

Field research carried out in the military 
cannot match the rigor of the experimental
laboratory situation. Such field research
courts the gamut of vicissitudes which tends
to decrease the reliability of measurements
of independent and dependent variables.
Moreover, laboratory Ss are apt to be
docile, sympathetic, intelligent, and often
they can be rewarded; Ss in the military
field situation can be quite antipathetic. The
use of controls in the laboratory research
can maximize the possibility of demonstrating
predicted relationships. Conversely,
some results in the laboratory may never
be capable of demonstration in the "dynamic"
or challenging field environment.

In the laboratory setting one can generally
be more certain that a relationship between
two variables is cause and effect or functional
in nature. However, with a sufficient
sample size in field research one can after
the fact attempt to control for extrinsic
variables through statistical manipulations.
This approach was used in the several field
studies reported here.
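
As a hedged illustration of what such after-the-fact statistical control can look like, the sketch below computes a first-order partial correlation in Python. The text does not say which statistical manipulations were actually used; the partial-correlation formula is a standard technique chosen here only for illustration, and the function names and example coefficients are hypothetical, not values from these studies.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with the extrinsic variable z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical coefficients (assumptions for illustration only):
# x = a predictor score, y = group performance, z = an extrinsic variable.
print(partial_r(r_xy=0.40, r_xz=0.30, r_yz=0.50))
```

If the predictor-criterion correlation remains sizable after the extrinsic variable is partialled out, the relationship is less likely to be an artifact of that variable; this is an after-the-fact control and cannot replace laboratory control.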


Theory of Reciprocality of Indulgences 


The dyadic theory presented was that 
when Individual A satisfies a need for In- 
dividual B, then Individual B tends to satisfy 
a need of Individual A. The satisfaction of 
a need by another is an indulgence. How- 
ever, indulgences may be given consciously 
or unconsciously. What is an indulgence
may vary among people; its applicability 
may be quite narrow down to a specific in- 
dividual, or up to groups of individuals, or 
to individuals almost universally. A leader's 
fairness and objectivity can be an indulgence 
to all his men. On the other hand, a leader's 
political and religious beliefs may provide 
an indulgence for some of his men and not 
for others. The same applies for his educa- 
tional and social background. An individual 
may indulge others through his behaviors 
which stem from his general attitudes or 
values and/or through such values permit- 
ting others to identify with him and, thus, 
also deriving a feeling of acceptance. Several 
kinds of indulgences were involved in these 
studies. 

The empirical relationships in this re- 
search supported the theory of reciprocality 
of indulgences and were consistent with
Homans' (1950) concept of the "exchange 
of favors." He stated that there is a uni- 
versal norm in human behavior that when 
Individual A performs a favor for B, then 
B is expected to perform a favor for A. 
Torrance’s (Torrance & Mason, 1958) 
notion of "negative identification," however, 
can be used to predict the same empirical 
consequences, but the nature of motivations 
inferred differs. According to Torrance,
when the leader's values differ from those 
of his men, the men are motivated to be- 
have negatively in terms of the leader's 
goals. For him negative identification occurs 
when A's background, personality, and/or 
values are so different from B's that the 
latter cannot identify with A. B, then, tends 
to disagree with and act in opposition to the 
opinion of A, regardless of the value or 
accuracy of A's statements. Hollander 
(1958) has presented a theory for explain- 
ing why one individual may still receive 
group acceptance and status although
deviating from common group expectancies,
while another individual deviating to no 
greater extent may have sanctions applied to 
him or even be expelled from the group. 
The first person, according to the theory, 
has accumulated a greater number of “posi- 
tively-disposed” impressions or “idiosyn- 
crasy credits” in perceptions of relevant 
others than the latter individual. The former 
individual finds himself in the favorable 
position through his task performance, gen- 
eralized characteristics, and fewer devia- 
tions from expectancies. Hollander’s theory 
is not necessarily inconsistent with those of 
Homans, Torrance, or the one presented 
here. However, the theory is more mani- 
festly consonant with that of Torrance in 
focusing on the source of deprivational be- 
havior in interpersonal relationships. 

The Torrance and Hollander theories, 
then, highlight the deprivational facet in in- 
terpersonal behavior. When Individual A 
does not satisfy a need for B or creates a 
dissatisfaction for B, then B does not satisfy 
a need for A or creates a dissatisfaction for 
A. Perhaps this is the “reciprocality of 
deprivations.” The leader’s failure to pro- 
vide an indulgence for his men may be a 
deprivation. The men may have reciprocated 
this deprivation by not carrying out their 
proper roles on the field problem. A theory 
of the reciprocality of deprivations, it might 
be argued, is just as explanatory of the 
empirical results as the theory of the re- 
ciprocality of indulgences. However, it 
cannot account for the source of motivation 
that causes the men to carry out their role 
expectations; a theory of reciprocal de- 
privations explains why men may not carry 
out their role expectations. It seems that in 
the present context as well as in many 
others both indulgence and deprivational 
theories may be needed to explain the vari- 
ous motivational and behavioral conse- 
quences. These two theories could be sub- 
sumed under a theory of the “reciprocality 
of kind"; Individual B's behavior in rela- 
tionship to Individual A's needs is going to 
be a function of A's satisfaction of B's 
needs. 

Any of the aforementioned theories, it 
seems, can have a much more comprehensive
application than just to interpersonal
relationships. Individual A, who has in- 
dulged or deprived Individual B, may repre- 
sent a group of individuals or values of a 
social system embodied in one or more in- 
dividuals. B, who may actually stand for 
more than one individual as in these studies, 
will react in accordance with his needs to 
the effect of A. Groups in this research can 
be conceived as representing small social 
systems with expected role behaviors. The 
degree to which the social system functioned 
properly was a function of A's need satis- 
faction of his men. The knowledge of the 
determining variables in interpersonal rela- 
tionships may provide a basis for under- 
standing some phenomena on another level 
of analysis. 

Gouldner (1960) has discussed the “norm 
of reciprocity" and its implications for the 
stability of social structures. He believes 
that the motive for repayment of “gratifica- 
tions" becomes internalized such that there 
is a norm that obliges one to repay an in- 
dulgence from another. This aspect of the 
reciprocality of indulgences would seem to 
bypass those elements of the leader-follower 
relationship where the focus is on the power 
of the leader over his people and the 
coercion he can employ to control them 
(Bass, 1960). The position here does not 
necessarily discount the importance of
power and coercion in leader-follower rela- 
tions, but rather it is suggested that these 
factors, perhaps part of the reciprocality of 
deprivations, cannot account for all of the 
motivations in interpersonal relationships, 
and, hence, in the present research it is not 
sufficient to explain group effectiveness. 
People may have a model, image, self-con- 
cept, or norm for themselves derived, in 
part, from the never ending socialization 
process. The extent to which their behavior 
in reciprocating indulgences adheres to their 
ideals may provide self-indulgence or
deprivation.

Present research results supported a
position of the independent importance to group
effectiveness of that aspect of the leader- 
follower relationship that does not require 
factors of power and coercion. The nature 
of the group problem was such that leaders
generally had little, if any, knowledge of
the performance of individual men.
Therefore, the leader had practically no basis to
hold individual men responsible for effective
or ineffective group performance. Society's 
situation would appear to be often com- 
parable. Moreover, since the leader's in- 
dulgences to his men included his role be- 
haviors and personality manifestations which 
continue to be displayed regardless of group 
performance, the group members were in 
little, if any, danger of losing these
indulgences even if they performed poorly.

In testing hypotheses based on the theory
of the reciprocality of indulgences in this
research, two assumptions were involved.
First, it was assumed that the men on the
field problem could in fact provide
indulgences for their leaders through appropriate
performances. However, the indebtedness
of B to A does not necessarily imply that
B can always satisfy the debt. For instance,
on the field problems there were some
situations where scoring depended upon sheer
technical knowledge. Regardless of a man's 
desire to indulge the leader, if he did not 
possess the prerequisite knowledge he could 
not act properly. Nevertheless, in both the 
Army and Air Force situations most of the 
scoring reflected the simple willingness of
the men to comply with the leader's orders 
and expend the appropriate energy. 

Second, the leader must want to have his 
group perform well. Through observations 
and the context in which the tests were made 
it seemed reasonable to assume leaders gen- 
erally wanted to do well and their men were 
aware of this desire. Nevertheless, the 
leader's need in this area is a variable.
Personal impressions and some analyses of the
data suggested that as a group Army leaders
were more desirous of their groups'
performing effectively as compared with the
Air Force leaders. Although each AC on
the CREWSCAT was told he could come
to the research office at the end of the school
course to determine his crew's performance,
no more than a handful ever took this
opportunity. It is not suggested that ACs had
no involvement in the performance of their
crews, but rather that they may have had
less involvement in the effective performance
of their groups than they would have had 
in a different context, with other group 
members, and a different task. However, in 
both Studies B and D there was some indi- 
cation that the size of the ADA correlations 
was positively related to the degree to which 
the leader was anxious for his group to per- 
form well on the field problem. Conse- 
quently, in testing the hypotheses in this 
research it was important to be able to 
assume that group members were capable 
of performing well and that such perform- 
ance would actually be an indulgence for 
leaders. 

Additional research could shed valuable 
light on various facets of the different re- 
ciprocalities. For instance, under the rubric 
of the reciprocality of kind it may be pos- 
sible to relate some personality variables to 
whether one is more prone to be influenced 
in behavior by moral obligation, a possible 
phase of the reciprocality of indulgences, or 
revenge, a general aspect of the reciprocality 
of deprivations. The absence of an indul- 
gence may be the same as a deprivation to 
some but not to others. Again, some per- 
sonalities may be more oriented to repaying 
indulgences than building up credits which 
they can later cash in. In fact, some people 
resent being indebted to others. Moreover, 
some individuals may be more apt to recipro- 
cate indulgences and deprivations of the very 
same nature rather than some other areas. 

Greer et al. (1954) presented data sug- 
gesting leaders of more effective squads 
have more accurate social perceptions
concerning their group members than leaders of
less effective squads. Accurate social per- 
ception may be associated with the ability 
to judge the needs and means of satisfaction 
for others, and, consequently, make more 
certain the operation of the reciprocality of 
indulgences. Another research problem re- 
lates to the possibility that the reciprocality 
of indulgences may not be a simple quid 
pro quo between individuals. Indeed, em- 
phasis on conscious bargaining procedures 
may often be detrimental to the process. 
In addition, beyond a certain point it may 
be of no avail for Individual A to indulge
further Individual B; some of the present
data suggested this possibility. Again, the 
unit of indulgence for one person may differ 
from others. Another possibility for study 
is that B may indulge A only under
conditions in which B believes A has a legitimate
right to have the specific need. Experimental 
and field research are required to substan- 
tiate further present findings and to pin 
down more definitively the situations and 
conditions under which the various recipro- 
cality theories are relevant and explanatory. 

The effective leader, then, is probably one 
who is sensitive to his group members' 
needs and to his ability to satisfy these 
needs. Furthermore, he clearly delineates 
his expectations in regard to group tasks 
and he is cognizant of group member capa- 
bility. The effective leader balances as 
appropriate and possible indulgences and 
deprivations for his followers. In those 
cases where group members are heterogene- 
ous in certain needs or the leader naturally 
deviates generally from group member 
values the effective leader will be one who 
cloaks true feelings and assumes neutrality 
or the midposition on the continuum. 
Platitudinous but true, the effective leader 
may have to represent different things to 
different group members. On the other 
hand, although certain leadership character- 
istics may be universally desirable, the 
nature of group members and tasks faced 
by the group often require quite varied 
leadership characteristics. Unfortunately, 
there must be a limit to a leader's ability 
to assume different forms and characteris- 
tics, and especially when there are simul- 
taneous requirements. Therefore, an in- 
dividual's potential as an effective leader 
may vary considerably with the nature of 
the group members and the kind of group 
task. 

The dyadic theory that “when Individual 
A satisfies a need of Individual B, then 
Individual B tends to satisfy a need of
Individual A" now deserves further
qualifications or emphases. Individual A must
actually have a relevant need; B must be 
capable of indulging this need. The extent 
to which B indulges A's need appears to be
a function of the intensity of that need.
Moreover, B may indulge A only in those
areas where B feels A has a legitimate right 
to expect indulgence. Finally, B can re- 
ciprocate A's indulgence to him only as a 
function of the opportunities afforded. 


SUMMARY


In a series of four field research studies 
involving 272 Army infantry rifle squads 
and Air Force crews several specific hypo- 
theses were tested. These hypotheses were 
derived from the theory of the reciprocality 
of indulgences. The theory indicates that 
the more a leader satisfies the needs of his 
men the harder they will work for him on 
a field problem and, therefore, the better 
the group will perform. The independent 
variables used were the perception by group 
members of the leader as a problem solver, 
the extent to which the leader was perceived 
as meeting the role expectations they had for 
an ideal leader, and the similarity between 
leader and men on authoritarianism. The 
dependent or criterion variables were group 
performance scores on realistic field prob- 
lems. Umpires rated specific group be- 
haviors to test situations which covered
6-hour periods.

Each of the three hypotheses, that is, 
problem solver, role discrepancy, and
authoritarian similarity, received statistically
significant support in initial studies and was
then replicated in a second study. However, 
in testing the first two hypotheses for the 
third time, inconclusive results were ob- 
tained. Analysis of the situation suggested 
that the subjects had been requested to 
supply answers to items for which they had 
little, if any, basis for responding. Field 
observations and various additional analyses 
of the data indicated these men, then, re- 
sponded on the basis of whether they simply 
liked their leader. In this particular study 
and in the others there was generally no 
significant positive relationship between the 
sociometric choices by the men for their 
leader and group performance. 

Through the use of the interpersonal
theory of the reciprocality of indulgences it
was possible to account for some of the
variance in the effectiveness of group
behavior. With the use of various statistical
analyses and the apparent reconciliations of
discrepancies with other studies it becomes
increasingly clear that in order to
understand fully the various determinants of
group performance one must examine the
leader's nature and behavior, the character
of group member needs, and the kind of 
group task or goal. Laboratory research is 
required to test further the conclusions 
drawn and other field and laboratory studies 
can explore the effects on group effective- 
ness of interpersonal relationships similar 
to the reciprocality of indulgences.

THE present study explores some of the
factors that determine how difficult a
classification will be to learn or remember.
By a "classification" we mean, here, simply
a grouping of a given set of stimuli into two
or more mutually exclusive and exhaustive
classes. The learning or memorization of a
classification can be regarded as a process of
associating, to each stimulus, a certain
response. This response might be the verbal
label arbitrarily assigned to the class containing
that stimulus, or it might be the act of
sorting that stimulus into the bin arbitrarily
assigned to its class. The essential feature
of a classification task, however, is that the
same response is assigned to several different
stimuli. Accordingly, we reserve the
term identification task for cases in which a
different response is paired with each
stimulus. In either case, the word “memorization”
is intended, here, to refer to those
conditions in which the materials to be learned
are presented only once prior to the test for
retention.


Learning by Concept and Learning by Rote 


In general, since the stimuli that are classi-
fied together need not be discriminated from 
each other, less information about a stimulus 
is required to classify it than to identify it. 
Therefore we might expect that classifica- 
tions would be more easily learned and re- 
membered than identifications. For example, 
if we have four horses and four dogs, we 
should certainly find it easier to remember 
one name for the horses and one for the 
dogs than to remember a different name for 
each of the eight individual animals. One is 
tempted to say that the difference, here, is 
between learning by concept and learning by 
rote. Horses presumably have something in 
common (not shared by the dogs) such that, 
after one name has been learned for three 
horses, the extension of this same name to 
the fourth horse requires little if any further 
learning. In the case of identification learn- 
ing no such saving is possible. After a dif- 
ferent name has been learned for each of 
three horses, the association of the fourth 
name to the fourth horse must still be 
formed de novo, i.e., by rote. 

Unfortunately there are certain drawbacks 
to the use of a comparison between identi- 
fication learning and classification learning 
for the purpose of clarifying the relation 
between rote and concept learning. First, as 
Bricker (1955) has pointed out, the reduc- 
tion in the number of responses entailed by 
the conversion of the identification task into
a classification task also results in a change
in the chance level of performance. For 
instance, with eight stimuli (as in our ex- 
ample), subjects (Ss) who responded com- 
pletely at random would on the average 
select the correct classifying response (out 
of the two alternatives) one-half of the time, 
but they would select the correct identifying 
response (out of the eight alternatives) only 
one-eighth of the time. Of course one could 
correct the obtained error scores for this dif- 
ference in chance level or else use a different 
measure (such as trials to criterion) for 
which the difference in chance level might 
not be as great. But, even if we were to 
substantiate in this way that classifications 
are easier than identifications, we should still 
have to sort out the contributions of two 
different factors. For the reduction that 
would presumably be found in the difficulty 
of classification learning could be a conse- 
quence of (a) the fact that the stimuli that 
are classified together have some property in 
common with which the classificatory re- 
sponse can be associated (without distin- 
guishing each stimulus from every other), 
or it could be a consequence of (b) the re- 
duction simply in the number of responses 
that must be “kept in mind.” Factor a seems 
central to concept learning, but b is pre- 
sumably more akin to the length-of-list 
factor investigated in studies of rote learn- 
ing. 
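
The chance-level arithmetic above can be checked with a minimal Python sketch. The function names are hypothetical, and the correction shown is one conventional correction for guessing, assumed here for illustration; the text does not specify which correction would be applied.

```python
from fractions import Fraction


def chance_level(n_responses):
    """Expected proportion correct when one of n equally likely responses is guessed."""
    return Fraction(1, n_responses)


def corrected_for_chance(p_correct, n_responses):
    """Rescale an observed proportion correct so that chance maps to 0 and perfect to 1."""
    c = chance_level(n_responses)
    return (Fraction(p_correct) - c) / (1 - c)


# Eight stimuli: two classifying responses versus eight identifying responses.
print(chance_level(2))   # classification task
print(chance_level(8))   # identification task

# The same observed proportion correct means different things in the two tasks.
observed = Fraction(3, 4)
print(corrected_for_chance(observed, 2), corrected_for_chance(observed, 8))
```

Such a correction puts the two tasks on a common footing with respect to chance, but, as the paragraph notes, it does not separate Factor a from Factor b.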

Actually, the difficulty of a classification 
task can be changed radically without alter- 
ing the set of stimuli or responses in any 
way but, rather, simply by modifying the as- 
signment between them. Surely we should 
have more difficulty in learning one name 
for two of the horses and two of the dogs 
and the other name for the remaining two 
horses and dogs than (as in the example con- 
sidered above) in learning one name for the 
horses and the other for the dogs. The cross- 
species classification evidently entails a 
larger component of rote learning and
might, indeed, be comparable in difficulty to
learning a separate identifying response for 
each animal. Moreover the difference in diffi- 
culty of two such classifications could not be 
attributed either to changes in length of list 
or in chance expectation. Clearly, then, the
extent to which the potential reduction in
difficulty from identification to classification 
learning is realized depends upon how the 
stimuli are grouped together in their assign- 
ment to the responses. In particular, classi- 
fication learning (with a fixed set of stimuli 
and responses) has been conclusively demon- 
strated to proceed more rapidly when the 
responses are assigned on the basis of com- 
mon properties of the stimuli rather than in 
a completely arbitrary manner (French, 
1953; Smith, 1954). And the same kind of 
result is found whether the stimuli are four- 
letter sequences (French, 1951), playing 
cards (Rogers, 1952), irregular closed 
curves (French, 1953), or other geometrical 
figures (Metzger, 1958; Smith, 1954; 
Wolfle, 1932). Metzger and Smith distin- 
guish the two contrasting conditions of clas- 
sification learning as “systematic” or “struc- 
tured” concept tasks, on the one hand, and 
“random” concept tasks on the other. How- 
ever, since the word "concept" seems to us
to imply "systematic" or "structured," we 
prefer the somewhat more neutral word 
"classification" when conditions are included 
in which the stimuli are grouped by fiat. In 
any case, if the rote component of a classi- 
fication task can be substantially changed 
simply by regrouping the stimuli in their 
assignment to the responses, some under- 
standing of the relation between rote and 
concept learning might be gained by examin- 
ing the performance of Ss when the same 
set of stimuli is classified in different ways. 


Characterization of Classifications in Terms 
of the Dimensions and Values of the Stimuli 


In order to simplify the task of describing 
and controlling the properties of the stimuli 
we have confined our investigation to stimuli 
constructed by selecting one of two possible 
values on each of three different dimensions. 
For example, the dimensions might be size, 
color, and shape and the values on these 
might be large or small, black or white, and 
square or triangular. We then get the
2 × 2 × 2, or eight, geometrical figures shown in
the box labeled I in Figure 1. These eight
stimuli can then be classified in a very large
number of ways. However, in order to
equate the informational content of the
different classifications (Hovland, 1952), we
use only dichotomous classifications in which
four of the eight stimuli are assigned to
one response and the remaining four stimuli
to the other.² The number of different
classifications of this kind is given by the
number of combinations of four things taken
from eight: namely, 8!/(4!)² or 70.

FIG. 1. Six different classifications of the same
set of eight stimuli. (Within each box the four
stimuli on the left belong in one class and the four
stimuli on the right in the other class.)
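
The count of 70 can be verified by direct enumeration. The sketch below is illustrative only: the value labels follow the example dimensions named in the text, and the variable names are hypothetical.

```python
from itertools import combinations, product
from math import factorial

# The eight stimuli: one of two values on each of three dimensions.
stimuli = list(product(("large", "small"),
                       ("black", "white"),
                       ("square", "triangle")))

# Each dichotomous classification assigns four stimuli to Response A
# and the remaining four to Response B.
classifications = [
    (frozenset(class_a), frozenset(stimuli) - frozenset(class_a))
    for class_a in combinations(stimuli, 4)
]

n_by_formula = factorial(8) // (factorial(4) ** 2)
print(len(stimuli), len(classifications), n_by_formula)  # prints: 8 70 70
```

The figure of 70 treats the two responses as distinguishable; if swapping the labels A and B is not counted as a new classification, the enumeration contains 35 distinct partitions.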

Six of these 70 possible classifications of 
the same eight stimuli are illustrated in the 
six boxes in Figure 1. In each case one 
response, say the letter A, might be assigned 
to the four stimuli on the left and another 
response, say B, to the four stimuli on the
right. In Box I, then, the necessary and
sufficient condition for the correct
application of Response A is simply that the
stimulus be black. The dimensions of size and


² The investigation of classifications of this kind
(i.e., in which stimuli taking on one of two values
on each of a small number of dimensions are divided
into two equal classes) was probably in the
Zeitgeist, judging by the number of independent
(and often unpublished) studies of such classifications
that were brought to our attention after
completing the experiments reported here. Our own
decision to undertake an exhaustive exploration of
all possible classifications of this kind grew, in part,
out of discussions with Alex Bavelas who was at
that time a member of the technical staff of the
Bell Telephone Laboratories. We are much indebted
to him for interesting suggestions along these lines. 


shape are irrelevant for this classification. 
Another classification of these same stimuli 
that would presumably be somewhat more 
difficult to learn and remember is illustrated 
in Box II. Here the necessary and sufficient
condition for Response A is that the stim- 
ulus be either black and triangular or else 
white and square. In this classification only 
the dimension of size is irrelevant. The 
classifications III, IV, and V will be discussed
later. The classification in Box VI,
however, represents an extreme case and 
should therefore be considered now. The 
necessary and sufficient condition for Re- 
sponse A in this classification is that the 
stimulus be either triangular, and large and 
black, or small and white; or else square, 
and large and white, or small and black. 
Here, none of the three dimensions is irrele- 
vant. If the difficulty of learning has any 
relation to the length of these rules, we 
should be able to demonstrate at least three 
levels of difficulty merely by changing the 
way in which these eight stimuli are classi- 
fied. Moreover, whereas the classification 
shown in Box I is a kind that has often been 
used to study the acquisition of concepts, the 
classification illustrated in Box VI might ap- 
proach in difficulty a rote identification task 
in which a different response must be asso- 
ciated with each of the eight stimuli. (It 
might be noted that, if the rule given for 
this classification in terms of the logical con- 
nectives of conjunction and disjunction is 
expanded, it is seen to be logically equivalent 
to a complete enumeration of the four 
stimuli to which Response A has been assigned.)


Six Basic Types of Classifications 


Fortunately not all 70 of the possible clas- 
sifications need to be examined separately; 
for they belong to only six basic types. And 
any classifications belonging to the same type 
are essentially equivalent. For example, a 
classification that depends upon the value of 
only one dimension can be regarded as the 
same general type of classification whether 
the critical dimension is that of color (as in 
Box I) or that of size or shape. Likewise 
the decision as to which of the two classes 
shall be assigned Response A and which
Response B seems insignificant. Generally, 
then, we are led to say that two different 
classifications are of the same type if and 
only if one can be obtained from the other 
simply by interchanging the roles of the 
three dimensions or by reversing the two 
responses. In this way the 70 different clas- 
sifications can be reduced to just six struc- 
turally distinct types. Each of the classifica- 
tions shown in Figure 1 is an example of a 
different one of these six types. Accord- 
ingly, we shall henceforth refer to the cor- 
responding types by the roman numerals I- 
VI. A detailed demonstration that there are 
just these six types will not be given here; 
it can be found in works on Boolean algebra 
and the theory of switching circuits (e.g., 
Higonnet & Grea, 1958, pp. 188-194). 

In the experiments to be described many 
different kinds of stimuli were used, but 
(with one exception) all can be character- 
ized in terms of three dimensions with two 
possible values on each. Hence we need a 
way of abstractly representing the eight 
stimuli and their six types of classifications 
without regard for the particular way in 
which the dimensions and values of the stim- 
uli are realized, physically, in any particular 
experiment. A useful way of doing this is to 
set up a correspondence between the eight 
stimuli and the eight corners of a cube so 


Fig. 2. An abstract representation of the eight
stimuli as the eight corners of a cube. (Axis labels
recoverable from the figure: color, shape.)


Fig. 3. The six basic types of classification represented
abstractly by coloring four corners of the
cube black and the remaining four white.


that the three dimensions of the cube repre- 
sent the three dimensions of the stimuli. 
Such a correspondence is illustrated, for the 
stimuli of Figure 1, in Figure 2. As can be 
seen, the four stimuli having any one prop- 
erty in common (i.e., having the same value 
on one dimension) all fall on one face of 
the cube. Furthermore, stimuli having two 
properties in common are separated by a 
single edge, stimuli having one property in 
common are separated by two edges (or a 
face diagonal), and stimuli having no prop- 
erties in common are separated by three 
edges (or a body diagonal). 

Any of the 70 possible classifications of 
the stimuli into two equal subclasses can 
then be indicated by coloring four of the 
eight corners black and the remaining four 
corners white. The abstract representations
for the six classifications illustrated in Fig- 
ure 1 are shown in this way in Figure 3. 
Any of the other classifications can be ob- 
tained from one of these six simply by rota- 
tions and reflections of the cube. On the 
other hand, no two of these six can be ob- 
tained from each other by any combinations 
of rotations and reflections. 
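
The reduction from 70 classifications to six types can be checked by brute force. The following is only a sketch under stated assumptions: the eight stimuli are coded as the binary corners of a cube, the 48 rotations and reflections are generated by permuting the three dimensions and interchanging the two values on any dimension, and reversing the two responses is also treated as preserving type. The names (CORNERS, canonical_form, count_types) are illustrative and do not come from the monograph.

```python
from itertools import combinations, permutations, product

# Eight stimuli coded as the corners of a cube: one binary value per dimension.
CORNERS = list(product((0, 1), repeat=3))


def cube_symmetries():
    """Yield the 48 rotations and reflections of the cube as corner maps."""
    for perm in permutations(range(3)):          # interchange the dimensions
        for flips in product((0, 1), repeat=3):  # interchange values on a dimension
            yield lambda c, perm=perm, flips=flips: tuple(
                c[perm[i]] ^ flips[i] for i in range(3)
            )


SYMMETRIES = list(cube_symmetries())


def canonical_form(black):
    """Return one fixed representative for every classification of the same type."""
    candidates = []
    for sym in SYMMETRIES:
        image = frozenset(sym(c) for c in black)
        complement = frozenset(CORNERS) - image  # reversing the two responses
        candidates.append(tuple(sorted(image)))
        candidates.append(tuple(sorted(complement)))
    return min(candidates)


def count_types():
    """Group all four-versus-four classifications by structural type."""
    classifications = [frozenset(c) for c in combinations(CORNERS, 4)]
    groups = {}
    for black in classifications:
        groups.setdefault(canonical_form(black), []).append(black)
    return len(classifications), sorted(len(g) for g in groups.values())


total, group_sizes = count_types()
print(total, len(group_sizes), group_sizes)
```

Under these assumptions the enumeration yields 70 classifications in six groups, in agreement with the count given in the text.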

The types of classifications differ in that, 
in order to classify the stimuli correctly, they 
require knowledge of the values on only one 
dimension, for Type I; two dimensions, for 
Type II; or all three dimensions, for Types 
III-VI. Although these last four types are
alike in that all three dimensions are rele- 
vant, they differ structurally in certain ways 
to be considered later.

The three ensuing sections describe in detail
three experiments designed, first of all,
to compare the six basic types of classifications
with respect to how difficult each type
is to learn or remember. In Experiment I
the stimuli are presented successively according
to the usual paired-associate procedure
(except, of course, that there are only
two responses). The measure of difficulty in
this experiment is the number of errors
made during learning. Experiment I also
provides information about transfer of classification
learning since Ss learn, in succession,
several classifications of the same basic
type but using different stimuli. In both Experiments
II and III, on the other hand, the
stimuli are presented simultaneously, as already
grouped into the two classes. Then,
following a period of inspection, Ss either
attempt to formulate a concise rule for how
the stimuli can be sorted into the two classes
or else attempt actually to so sort the stimuli.
These two experiments also furnish information
about how the way in which the
dimensions and values of the stimuli are
represented by the physical features of the
stimuli affects the difficulties of the different
types of classifications. The main variation
there is between “compact” stimuli (like
those in Figure 1) in which all three dimensions
are represented by different aspects of
the same object and “distributed” stimuli in
which each dimension is represented by
variations in a different one of three spatially
separated objects.

After the summary of the empirical results
of these three experiments (which immediately
follows the detailed presentation
of Experiment III) we attempt to evaluate
alternative theoretical notions about classification
learning with respect to their ability
to account for the experimental results. In
particular, we shall argue that neither the
models of stimulus generalization nor those
of the conditioning of cues are alone sufficient
but that, in addition, something like
abstraction and the formulation of rules is
apparently involved.


EXPERIMENT I 


The experiment to be reported first was
designed primarily to answer two questions:
How does the difficulty of learning vary
from one type of classification to another?
Is something specific learned about the struc- 
ture of a classification that will transfer 
positively to the subsequent learning of a 
new classification of that same type? The 
experimental procedure conformed to the 
usual paired-associate paradigm except that 
only two responses were used. That is, the 
eight stimuli were presented, one at a time, 
in a continuing random sequence and an 
association between each stimulus and one of 
two alternative classificatory responses was 
built up by the method of anticipation. In 
order to obtain further information about 
the relation between identification and classi- 
fication learning with the same set of stimuli, 
though, a condition was also included in 
which a different response was assigned to 
each of the eight stimuli. 


Method 


Subjects. Six female freshmen at Fairleigh 
Dickinson University served for 15 hours each in 
this first experiment. These Ss were selected to be
as uniform as possible with respect to their college 
entrance examination scores. 

Learning tasks. Each S went through a different 
sequence of 27 learning tasks called problems. 
During any one problem S learned to associate a 
prescribed verbal response to each of eight stimuli. 
In most of the problems one response (e.g., A)
was assigned to four of the eight stimuli and
another response (e.g., B) was assigned to the
remaining four stimuli. Except in certain special
problems, each of the eight stimuli took on one of
two values on each of three dimensions. Thus any
type of classification from I through VI could be
established. The stimuli were photographed on individual
frames of 16-mm. film and projected onto
a screen in front of S. A self-paced method of
anticipation was used: i.e., as soon as S responded
to a given stimulus she was told what the correct 
response for that stimulus in fact was and then the 
film was advanced to the next frame. Each of the 
eight stimuli on a given film occurred 50 times 
making a total of 400 frames. The order of the 
stimuli on each film was random except for the 
following constraints: within the first two blocks 
of 8 frames, each stimulus appeared exactly once; 
within every succeeding block of 16 frames, each 
stimulus appeared exactly twice. For each problem, 
learning continued until S attained a criterion of
32 consecutive correct responses. 
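
The ordering constraints just described can be sketched as a small generator. This is an illustration only, not the procedure actually used to prepare the films: the function name film_order, its parameters, and the use of a seeded pseudorandom shuffle are all assumptions.

```python
import random


def film_order(n_stimuli=8, n_frames=400, seed=None):
    """Return a constrained random order of stimulus indices for one film.

    The first two blocks of 8 frames contain each stimulus exactly once;
    every succeeding block of 16 frames contains each stimulus exactly twice.
    """
    rng = random.Random(seed)
    stimuli = list(range(n_stimuli))
    order = []

    # Two opening blocks, each a random permutation of the eight stimuli.
    for _ in range(2):
        block = stimuli[:]
        rng.shuffle(block)
        order.extend(block)

    # Remaining blocks of 16 frames, each stimulus appearing twice per block.
    while len(order) < n_frames:
        block = stimuli * 2
        rng.shuffle(block)
        order.extend(block)

    return order


frames = film_order(seed=1961)
print(len(frames), frames[:16])
```

With eight stimuli and 400 frames the blocks fit exactly (16 + 24 × 16 = 400), so each stimulus occurs 50 times, as stated above.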

Stimuli. Each of six film strips was prepared 
from a different set of eight stimuli. Figure 4 
shows the eight stimuli used in one of these film 
strips. Each of the three positions in a stimulus 
represented a dimension in which either of two 
thematically related drawings could appear as
values. The arrangement of the eight stimuli in
Figure 4 is analogous to the arrangement of the
stimuli in Box I of Figure 1. Thus a Type I
classification could be established by assigning
Response A to the four stimuli on the left and
Response B to the four stimuli on the right. In 
this case S would merely have to note whether 
the stimulus contained a candle or a light bulb in 
the lower left position in order to master the clas- 
sification. These pictorial stimuli (rather than the 
simple geometrical figures of Figure 1) were 
selected for this first experiment in order to sim- 
plify the task of constructing the many different 
but comparable sets of stimuli required by the ex- 
perimental design. Thus it was relatively easy to 
prepare another film in which either a desk or a 
chair appeared in the top position, an eye or an ear 
in the lower left, and a spool of thread or a pair 
of scissors in the lower right, and so on for the 
other films. One of the six films was constructed 
in a different manner. The same kinds of drawings 
were used, but an entirely different set of three 


Fig. 4. One of the sets of eight stimuli used in
Experiment I.


drawings was selected for each of the eight stimuli.
Since no two of the eight stimuli had any picture 
in common, this is referred to as the "nonoverlap" 
film. 


Experimental design. Table 1 presents the over-all
design. There were four general classes of
problems. Two of these used the five films with
"overlapping" stimuli (i.e., stimuli with two values
on three dimensions); they were identification problems
(IDo, for "identification with overlap") in
which a different response-letter (D, H, K, M, O,
R, S, or W) was associated with each of the eight
stimuli, and classification problems (I, II, III, IV,
V, or VI) in which each of two responses was
associated with four of the eight stimuli.
Each classification problem used one of five alternative
sets of responses: "A" and "B," "plus" and
"minus," "P" and "O," "one" and "two," or "X"
and "Y." The other two general classes of problems
used the nonoverlap film; they were identification
problems (IDs) in which the eight letters of
the alphabet were assigned to the eight stimuli, and
classification problems (Cs) in which each of two
responses was associated with four of the eight
stimuli. (Since none of the stimuli on the nonoverlap
film had any common properties, all Cs
problems are of the same type.)


As indicated in the table, each of the six Ss was 
given five consecutive problems of one type, then 
five consecutive problems of another type, and so 
on for four different types. A different film (i.e.,
set of stimuli) was used for each of the five classi- 
fication problems of the first type administered to 
each S. The order of these films, however, was 
different for different Ss. On subsequent types of 
problems the five films were used again for each S 
in the same order in which they had been presented 
during the five problems of the first type for that 
S. The order of the films for the identification 
problems (IDo) was the same except that it began
with the assignment of the first film to Problem 24 
and ended with the assignment of the last film to 
Problem 1. The one nonoverlap film was used for 
both Problems 22 and 23. For each S the same set 
of responses was retained throughout the problems 
of a single type, but changed when the S proceeded
to the next type. Indeed all Ss had A and B as 
responses for the first five classification problems, 
plus and minus as responses for the next five, and 
so on. For each problem the assignment of the 
response to the stimuli was chosen at random from 
all possible assignments that would conform to the 
type prescribed by the design. Thus for a given S
on the first problem of Type II the lower right 
drawing might be the irrelevant one whereas on 
the second problem of Type II the upper drawing 
might be the irrelevant one. 

Every S learned five problems of each of the 
Types I, II, and VI but (owing to limitations of 
time) five problems of only one of the Types III,
IV, and V. The sequence of types was different 
for each S but counterbalanced to the extent that 
each type occurred as frequently toward the beginning
as toward the end of the series of classification
problems. (Unfortunately a complete counter- 
balancing could not be achieved with the number of 
Ss available. Hence there is a partial confounding 
of types and order of presentation of types.) Since 
the Ss could usually be scheduled for only an hour 
at a time, the following rules were adopted: If an 
S reached criterion on a given problem early in the 
hour, she was started on the next problem. If she 
reached criterion late in the hour, she was not given 
a new problem until the next session. And if (as 
occasionally happened) she had not reached crite- 
rion by the end of the hour, she was continued on 
that same problem at the beginning of the next 
session. All Ss appeared three times per week at 
roughly regular intervals throughout each week. 


Instructions. The nature of the learning tasks 
was explained to each S, but nothing was said 
about the structures of the different types of clas- 
sifications (I-VI). The experimenter (E) simply 
stated that S would learn five problems that were 
similar and of about the same level of difficulty, 
then five more problems that again were similar 
among themselves but different from the first five 
problems, etc. The statement was also made that 
the problems in each block of five might be more 
difficult or less difficult than the problems in the 
preceding block of five. Each S was urged not to 
discuss the problems with other Ss until after the 
experiment was completed. (Also, the variation of 
the order of the films, order of types of problems, 
and assignment of responses from one S to another 
presumably minimized the opportunity for communication
of this kind.)

At the beginning of the first problem for each S
the manner of construction of the eight stimuli 
was described. And, at the outset of that and each 
subsequent problem, the eight stimuli were shown 
one at a time. Before each problem S was also 
told what the set of responses for that problem 
would be. During identification learning S was 
not required to guess the responses for the first
eight stimuli, but was simply told what these were.
During classification learning, though, there were 
only two responses and S was required to guess 
from the outset. Each time a new problem was 
begun that was of the same type as the preceding 
problem, this fact was pointed out to S. Likewise, 
when the problem was of a new type, S was told 
that, although the stimuli would be the same as 
those she had already seen in an earlier problem, 
the problem itself might be quite different from 
the preceding problems. 

After criterion was reached on any problem, E 
informally asked S whether she had any observations
to report concerning the problem or how she
had gone about learning the responses. In order to 
minimize the influence of this inquiry upon the 
subsequent behavior of the S in other problems, 
suggestions that S should be able to verbalize a
rule relating the responses to the stimuli were 
avoided. For the same reason, if S had nothing or 
only vague observations to report, no attempt was 
made to press for further explanation. Consequently
the record as to the rules formulated by Ss during 
learning does not provide detailed information 
about all subject-problem combinations. 


Results 


This section presents the results of Ex- 
periment I in detail. A general summary of 
the results of this experiment (as well as of 
the other two experiments, II and III) will 
be found in the section Discussion of Em- 
pirical Results which immediately follows 
the detailed presentation of the results of 
Experiment III. 

Relations between measures of problem
difficulty. The following four measures of
performance were taken: the total time (in


TABLE 1

EXPERIMENTAL DESIGN

                     Successive problems
Ss    1      2-6     7-11    12-16   17-21   22     23     24-27
S1    IDo    I       II      (V)     VI      Cs     IDs    IDo
S2    IDo    VI      (III)   II      I       Cs     IDs    IDo
S3    IDo    I       VI      II      (IV)    Cs     IDs    IDo
S4    IDo    (IV)    II      VI      I       Cs     IDs    IDo
S5    IDo    II      (V)     I       VI      Cs     IDs    IDo
S6    IDo    VI      I       (III)   II      Cs     IDs    IDo

Note.—Problems 2-21 make up the main sequence of classification
problems. During this sequence each S was given each of
the Types I, II, and VI but only one of the three Types III, IV,
and V. These latter types are enclosed in parentheses in
order to set them apart in the table.
minutes) required to reach criterion (t), the
total number of stimulus presentations excluding
the 32 presentations within the criterion
sequence itself (p), the total number of
incorrect responses (errors) made prior to
reaching criterion (e), and the number of
errors during the first 32 presentations (f).
The correlations between these measures
(over all 162 subject-problem combinations)
were as follows: r_tp = 0.93, r_te = 0.94, r_tf =
0.67, r_pe = 0.90, r_pf = 0.63, and r_ef = 0.77.
The f measure was probably the least reliable
of the four since it was based on only the
first 32 presentations. This may account for
its low average correlation with the three
other measures (viz., 0.69). Nevertheless,
all four measures were found to have essentially
the same relations to the independent
variables of the experiment. Rather than
carry along all four, then, only the total
number of errors, e, will be used as the dependent
variable in what follows. This
measure had the highest average correlation 
with the other three (viz., 0.87) and has 
been used in previous studies of classifica- 
tion learning (cf. Bourne & Restle, 1959; 
Smith, 1954). 
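
The two average correlations quoted above can be recomputed from the six pairwise coefficients. The sketch below assumes that the six values pair the measures t, p, e, and f in the order tp, te, tf, pe, pf, ef; the subscripts are damaged in this copy, so that pairing is a reading and not something the text confirms. The names R and average_correlation are illustrative.

```python
from statistics import mean

# Pairwise correlations among the four measures (assumed pairing; see lead-in).
R = {
    ("t", "p"): 0.93,
    ("t", "e"): 0.94,
    ("t", "f"): 0.67,
    ("p", "e"): 0.90,
    ("p", "f"): 0.63,
    ("e", "f"): 0.77,
}


def average_correlation(measure):
    """Mean correlation of one measure with the other three."""
    return mean(r for pair, r in R.items() if measure in pair)


print(round(average_correlation("f"), 2))
print(round(average_correlation("e"), 2))
```

Under this pairing the averages for f and e come out at 0.69 and 0.87, the two values given in the text, which lends some support to the assumed reading of the subscripts.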

Transfer of classification learning. Fig- 
ure 5 presents the mean number of errors 
per S on the 20 principal classification prob- 
lems (Problems 2-21 of Table 1). Beyond 
the over-all (secular) decline throughout the 
20 problems, there was a pronounced drop in 

Fig. 5. Mean number of errors for the main
sequence of 20 classification problems. (Problems
2-21 in Table 1. Points corresponding to problems
of the same type are connected by lines.)


Fig. 6. Mean number of errors for the first
through fifth classification problems of each type.
(Separate curves are plotted for Types I, II, and
VI, but a single curve for Types III, IV, and V
combined.)
(Horizontal axis: successive problems of the same type.)


errors during each block of five consecutive 
problems of the same type. This specifically 
within-type transfer occurred even though 
the stimuli and the assignment of responses 
changed for each new problem. What trans- 
ferred, then, was neither a particular set of 
stimulus-response associations nor simply a
generalized increase in ability to handle new 
problems; it was something about the unique 
structure of the type itself. However, as 
will be seen, most of this within-type trans- 
fer can be attributed to Type VI classifica- 
tions alone. 


Comparisons between the six types of 
classifications. Figure 6 shows how the over- 
all difficulty and the within-type transfer 
varied from one type of classification to 
another. The individual curves for Types 
III, IV, and V are not presented separately. 


3 Also, of course, the sudden increase in errors 
whenever a new type of classification was intro- 
duced might in part be a consequence either of an 
emotionally disrupting effect of the instruction that 
the next problem would be of a new kind, or else 
of some interference resulting from the repetition 
of the set of stimuli used five problems earlier with 
different responses. However, these explanations 
seem implausible in view of the presence of a
strong interaction between positive transfer and
type of classification, the total absence of overt
intrusions of previously correct responses during 
the new problem, and the subsequent reports of the 
Ss themselves. 
Since they were based on only two Ss each,
they are quite erratic and their inclusion 
would therefore tend to obscure any pattern 
in the more stable curves. For this reason 
and because theoretical considerations indi- 
cated that these three types might be about 
equal in difficulty, they were averaged to- 
gether to yield a single curve that is com- 
parable in stability with the curves for 
Types I, II, and VI.⁴

The reliability of the pattern exhibited in 
Figure 6 was evaluated as follows: First, 
the number of errors, e, made by each S on 
each of the 20 classification problems was 
transformed to yield a new difficulty score e'
by the commonly used logarithmic transfor- 
mation e’ = log, (e + 1). The log trans- 
formation (used also in Smith's study, 1954) 
largely eliminated the initially apparent de- 
pendence of the variance of errors upon 
their mean value. Then, a second-order 
orthogonal polynomial was fitted to the five 
transformed points in each one of the 6 × 4
cells corresponding to a different subject- 
type combination. Thus the set of five error 
scores in a given cell was reduced to a set of
three coefficients: the mean value of e' for
the given S on the five problems of the given 
type, the linear trend in e' over the five 
problems, and the quadratic curvature of 
that trend. Finally, an over-all analysis of 
variance was carried out on these coefficients.⁵
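
The reduction of each cell's five error scores to three coefficients can be sketched as follows. This is a minimal illustration and not the computation actually carried out for the monograph: the base of the logarithm is not legible in this copy, so it is left as a parameter (base 2 is an arbitrary default), and the contrast weights and their normalization are the standard orthogonal polynomial contrasts for five equally spaced points, assumed here for illustration. The correction coefficients for the partial confounding are not modeled.

```python
import math

# Standard orthogonal polynomial contrasts for five equally spaced points.
LINEAR = (-2, -1, 0, 1, 2)
QUADRATIC = (2, -1, -2, -1, 2)


def trend_coefficients(errors, base=2.0):
    """Reduce five error counts to (mean, linear trend, quadratic curvature).

    Each count e is first transformed to e' = log(e + 1) in the given base;
    the base is an assumption and only rescales the coefficients.
    """
    if len(errors) != 5:
        raise ValueError("expected error counts for exactly five problems")
    scores = [math.log(e + 1, base) for e in errors]
    mean = sum(scores) / len(scores)
    # Each contrast is divided by its sum of squared weights (assumed scaling).
    linear = sum(c * s for c, s in zip(LINEAR, scores)) / sum(c * c for c in LINEAR)
    quadratic = sum(c * s for c, s in zip(QUADRATIC, scores)) / sum(
        c * c for c in QUADRATIC
    )
    return mean, linear, quadratic


# Hypothetical error counts for one subject-type cell (not data from the study).
print(trend_coefficients([31, 15, 7, 3, 1]))
```

With the hypothetical counts shown, the transformed scores fall on a straight line, so the quadratic coefficient is essentially zero and the linear coefficient is negative, which is the pattern a steady decline in errors would produce.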

The analysis indicated that the four curves
in Figure 6 do differ reliably in over-all level
(F = 45, p < .05), in linear trend (F =
80, p < .01), and in curvature (F = 6.4,
p < .01). The average left-to-right trend
of all these curves taken together is also
statistically significant both in its linear component
(F = 26.4, p < .01) and in its curvature
component (F = 8.4, p < .05). However,
as is clear from Figure 6, this average
linear trend and curvature as well as the
differences between types with respect to
trend and curvature is largely attributable to
the very marked decline and concavity of the
curve for Type VI alone. Order of presentation
of types of classifications had no
significant effect upon the height of the
curve for each type (F = 1.0), the linear
trend (F = 3.4), or the curvature (F =
3.0). The fact that the left-to-right linear
trend in Figure 6 is significant whereas the
effect of order of presentation of types is
not significant further supports the conclusion,
drawn from Figure 5, that the within-type
positive transfer was reliably greater
than the between-type positive transfer. In a
somewhat more rigorous test of this point,
the over-all downward trend of the curves
in Figure 6 was found to be significant even
after the linear component of the order
effect (i.e., the secular decline in Figure 5)
was subtracted out. (Again, however, this
within-type trend is largely contributed by
Type VI alone.) Finally, the six Ss did not
differ significantly either in over-all performance
(F = 2.4) or in the linear trend
of their performance over a series of problems
of the same type (F = 1.7). Oddly,
however, they did differ in the quadratic
component of this trend (F = 8.3, p < .01).
This last result appears to be primarily attributable
to one S (viz., S4) whose curves
were all convex (rather than concave) upwards.

Since the four curves presented in Figure 6
evidently do differ reliably, a more
detailed examination of these differences was
undertaken. With respect to the first problem
learned of each type (the left-most
point of each curve), the results of the comparisons
between types (using two-tailed t
tests) were as follows: the first point for VI
was significantly different from the point for
III, IV, and V combined (p < .01); the
point for III, IV, and V, in turn, was different
from the point for II (p < .05); the
difference between the first points for II and
I, however, was not significant. The differences
in the initial difficulties of the six types
cannot be attributed to interference or negative
transfer from preceding classification
problems; for the same ordering is found
when we look at the first problem of only the
first type learned by each S. Thus the average
number of errors made on the very first
classification problem (Problem 2 in Table 1)
was 7, for Type I; 29, for Type II; 41, for
Type IV; and 86, for Type VI. (None of
the Ss had Types III or V first.)


4 Unfortunately the experimental design does not
permit an adequate test of possible differences between
the individual curves for Types III, IV, and
V. With the data that were obtained, however, the
curves for III and V did not exhibit any consistent
differences either in over-all height or in trend.
The curve for IV did generally fall somewhat
below the curves for III and V; but the two Ss
who were given Type IV problems happened to be
the two who made the smallest number of errors
on the other problems also. About all that can be
claimed at this point is that the results are at least
consistent with the conclusion (established more
firmly in the following experiments) that Types
III, IV, and V are essentially equal in difficulty.

5 The fact that the experimental design entailed
a partial confounding of types and order of presentation
of types necessitated the computation of
certain correction coefficients in order to render
orthogonal any comparisons involving these two
variables. We are greatly indebted to M. J. R.
Healy who, while a visiting member of the Bell
Telephone Laboratories, proposed the method of
analysis and derived the formulas for the required
correction coefficients.

The comparison of the differences be- 
tween types for the points corresponding to 
the second through fifth problems of each 
type is complicated by the significant de- 
pendence of within-type trend upon type. 
However, for the purposes of the subse- 
quent theoretical discussion, it is sufficient to 
observe that the downward trend for VI is 
significantly greater than that for any other 
curve. Thus the results are consistent with 
the hypothesis of an initial ranking: I < II
< (III, IV, V) < VI. But they also indicate
that, with continued practice, VI decreases
in difficulty relative to the other
types and, so, eventually becomes easier than
III, IV, and V (considered together).


Analysis of rules verbalized by Ss. For 
all but four of the subject-problem combinations
S described the classification in terms
of an explicit rule. All but five of these ex- 
plicit rules proved to be correct in the sense 
that, by sorting the stimuli in accordance 
with the given rule, E could reconstruct the 
correct classification. All of these rules were 
rated for amount of unnecessary complexity; 
i.e., the amount of complexity of the stated
rule over and above that of the least com- 
plex rule possible for the given type of clas- 
sification. Some of the types admit two 
different rules that seemed about equally
economical. Examples of each of the kinds
of rules that were taken to be most econom- 
ical for each type of classification are as 
follows: 

Type I: "If there's a candle it's an A; otherwise
B."

Type II: "If there's a candle and trumpet or if
there's a light bulb and violin it's an A; otherwise
B."

Type IIIa: "If there's a candle but not both a
violin and screw, or if there's a violin and nut, it's
an A; otherwise B."

b: "If there's a candle and trumpet or if there's
a violin and screw it's an A; otherwise B."

Type IVa: "If there's a candle but not both a
violin and screw, or if there's a trumpet and nut,
it's an A; otherwise B."

b: "If there's a candle, violin, and screw or
any two of these it's an A; otherwise B."

Type V: "If there's a candle but not both a
violin and screw, or if there's a violin and screw
but not a candle, it's an A; otherwise B."

Type VIa: "If there's a candle, violin, and screw
or just one of these three it's an A; otherwise B."

b: "If either just one or else all three pictures
change the response changes to the other alternative;
otherwise the response remains the same."
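
The example rules can be written out as predicates and checked mechanically. The sketch below is illustrative only. It assumes that the three picture dimensions are candle versus light bulb, trumpet versus violin, and nut versus screw, as the rules suggest; the names VALUES, RULES, and relevant_dimensions are hypothetical. Rule VIb is omitted because it refers to the preceding stimulus and so cannot be expressed as a predicate on a single stimulus.

```python
from itertools import product

# Assumed dimensions, inferred from the pictures named in the example rules.
VALUES = (("candle", "light bulb"), ("trumpet", "violin"), ("nut", "screw"))
STIMULI = list(product(*VALUES))


def pivot_count(s):
    """Number of pictures shared with the stimulus (candle, violin, screw)."""
    return sum(p in s for p in ("candle", "violin", "screw"))


def candle_with_exception(s):
    """'A candle but not both a violin and screw' (shared by IIIa, IVa, V)."""
    return "candle" in s and not ("violin" in s and "screw" in s)


# Each predicate returns True when the rule assigns Response A.
RULES = {
    "I": lambda s: "candle" in s,
    "II": lambda s: ("candle" in s and "trumpet" in s)
    or ("light bulb" in s and "violin" in s),
    "IIIa": lambda s: candle_with_exception(s) or ("violin" in s and "nut" in s),
    "IIIb": lambda s: ("candle" in s and "trumpet" in s)
    or ("violin" in s and "screw" in s),
    "IVa": lambda s: candle_with_exception(s) or ("trumpet" in s and "nut" in s),
    "IVb": lambda s: pivot_count(s) >= 2,
    "V": lambda s: candle_with_exception(s)
    or ("violin" in s and "screw" in s and "candle" not in s),
    "VIa": lambda s: pivot_count(s) in (1, 3),
}


def flip(stimulus, dim):
    """Return the stimulus with its value on one dimension interchanged."""
    first, second = VALUES[dim]
    other = second if stimulus[dim] == first else first
    return stimulus[:dim] + (other,) + stimulus[dim + 1:]


def relevant_dimensions(rule):
    """Dimensions whose value changes the response for at least one stimulus."""
    return [
        d for d in range(3)
        if any(rule(s) != rule(flip(s, d)) for s in STIMULI)
    ]


for name, rule in RULES.items():
    n_a = sum(rule(s) for s in STIMULI)
    print(name, n_a, relevant_dimensions(rule))
```

Each rule, read this way, assigns Response A to exactly four of the eight stimuli; one dimension is relevant for the Type I rule, two for Type II, and all three for the remaining rules.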


The rule for Type I simply specifies the 
values on the one relevant dimension. The 
rule for II does the same for the two rele- 
vant dimensions. The rules for IIIa, IVa, 
and Va are of the same general kind. They 
might be called “single dimension with ex- 
ceptions" rules in that they specify not only 
the values on one relevant dimension (as in 
Type I) but also the two exceptional stimuli 
for which the responses must be reversed.⁶
The rule for IIIb is similar to that for IIb 
except that three rather than two dimensions 


6 In general this kind of rule is of this form: “If
there's a candle it’s an A; otherwise B; except the 
one made up of a candle, violin, and screw must be 
exchanged with the one composed of a light bulb, 
trumpet, and nut.” The specific forms of this 
“single dimension with exceptions” type of rule that 
were given above differ from this general form in 
that they take advantage of certain subtle differ- 
ences in the structures of Types III, IV, and V in 
order to shorten the rules slightly by omitting men- 
tion of one value for each of the two exceptional 
stimuli. However, Ss were considered to have 
formulated the simplest rule for Types III, IV, or 
V even if they mentioned all three values of the 
exceptional stimuli as in this general form of the 
rule. 
are involved. The vocabulary is coordinate 
for all the rules just considered; but for the 
further rules IVb, VIa, and VIb—which we 
shall sometimes refer to as the “odd-even” 
rules—a new process of counting enters. 
(As indicated in the introduction, by the 
length of the rule for VI given there in 
terms of a logical conjunction and disjunc- 
tion of values, the rule for this type is ex- 
tremely complicated if counting is not used.) 
The rules IIIb and VIa are similar in that 
the classification is defined in terms of the 
three pictures contained in a single “pivotal” 
stimulus. However, they differ in that, 
whereas any stimulus can be chosen as 
pivotal in VI, only two of the eight stimuli 
will serve in III. The final rule, VIb, is 
unique in that it involves a comparison of 
two consecutively presented stimuli and, so, 
would not provide any basis for responding 
to the first stimulus presented. 

Three judges (JG, HMJ, RNS) inde- 
pendently rated the amount of complexity of 
the rules stated by the Ss over and above 
that of these most economical rules. The 
rating was done in random order with 
knowledge of the type of classification in- 
volved but without knowledge about which 
of the six Ss produced the rule or how many 
problems had already been learned by that S. 
The rating was done on a five-point scale 
according to the following guide: 

0. Equivalent to one of the most economical 
statements of the given type of classification. 

1. As above, but either includes the complete 
rule for both responses (rather than simply spe- 
cifying the application of the second response by 
exclusion) or else states the values on an irrelevant 
dimension, but not both of these. (Example of 
stating an irrelevant dimension in Type I: “If 
there's a candle and a violin or trumpet it's an A; 
otherwise B.”) 

2. Fails to use exclusion and includes irrelevant 
dimension; or repeats some information unneces- 
sarily; or is incomplete. 

3 and 4. Increasing degrees of complexity. 

5. Enumerates all four stimuli in each class or 
states a rule judged to be equivalent to such an 
enumeration in terms of the number of dimensions 
and values specified. 

The three pair-wise correlations between 
the three judges’ ratings of the amount of 
unnecessary complexity of the stated rules 
were .94, .90, and .89. Since the judges 


seemed to produce similar ratings, their rat- 
ings were averaged for comparisons with 
other variables. Their mean ratings of un- 
necessary complexity for Types I, II, III, 
IV, V, and VI were 1.6, 1.7, 2.9, 1.8, 3.2, 
and 2.3 (in that order). For purposes of 
comparison, the number of errors made dur- 
ing the learning of classifications of each 
type (averaged over all five problems of the 
same type) were 8.3, 13.2, 32.0, 16.7, 23.6, 
and 28.0, respectively. The correlation be- 
tween these two sets of numbers is statis- 
tically significant (r = .80, p < .05). There 
was also a decrease in the rated complexity 
of rules in successive problems of the same 
type. This is shown, along with the corre- 
sponding reduction in errors (for all types 
taken together), in Figure 7. The similarity 
of the two curves suggests that there may 
be a close relation between the reduction of 
errors and the discovery of a more econom- 
ical rule. Type VI showed the greatest re- 
duction in complexity of stated rule, just as 
it showed the greatest reduction in errors, 
during the course of the five successive 
problems. 
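
The reported correlation can be reproduced from the two sets of six means given in the text. The sketch below is a plain Pearson product-moment computation; the function name `pearson_r` is illustrative, and the assumption that the reported coefficient is a Pearson r over the six type means is ours, since the text does not name the statistic.

```python
from math import sqrt

# Means as reported in the text, in the order Types I, II, III, IV, V, VI.
complexity = [1.6, 1.7, 2.9, 1.8, 3.2, 2.3]   # mean rated unnecessary complexity
errors = [8.3, 13.2, 32.0, 16.7, 23.6, 28.0]  # mean errors during learning


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(complexity, errors)
print(f"r = {r:.2f}")
```

Under that assumption the computed value agrees with the reported r = .80 to two decimal places.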


Fig. 7. Mean number of errors and mean rating 
of the amount of unnecessary complexity in the 
rule stated by Ss for the first through fifth prob- 
lems of the same type. (The average is taken over 
all six types of classifications. The location of the 
zero points and the size of the units have been 
adjusted to bring the two curves into approximate 
alignment.) 


Identification and nonoverlap conditions. 
In all conditions there were only eight dif- 
ferent stimuli; and these were highly discrim- 
inable in the sense that the difference be- 
tween any two could easily be seen and 
described by Ss. Moreover these stimuli 
were meaningful in the sense that they de- 
picted familiar objects for which the Ss al- 
ready had overlearned verbal labels. Now a 
paired-associate list composed of only eight 
highly discriminable and meaningful stimuli 
would seem to be relatively easy. Yet the six 
Ss found most of the classification and 
identification problems quite difficult. On 
the average, they mastered the first identi- 
fication problem (IDo) only after 59 min- 
utes, 249 stimulus presentations (which is 
equivalent to 31 times through the "list"), 
and 103 incorrect responses. The reason for 
the difficulty of this task is made clear by a 
comparison with the corresponding non- 
overlap condition (IDs). This problem was 
mastered, on the average, after only 3.7 min- 
utes, 4.7 stimulus presentations, and 0.8 in- 
correct responses. Of course, of these two 
identification problems, the one with over- 
lapping stimuli always preceded the one with 
nonoverlapping stimuli, but this cannot alone 
account for the hundredfold difference in 
errors. For the average number of errors on 
the last four identification problems with 
overlapping stimuli (IDs) was still 23 (as 
opposed to 0.8 for the nonoverlapping stim- 
uli). 

The results, then, support the following 
account: The identification problem with 
nonoverlapping stimuli was relatively easy 
because S could associate the response for 
each stimulus with each of the component 
pictures of that stimulus independently. In 
fact, the problem could be mastered by at- 
tending to the picture in only one (say the 
lower left) position. And the pictures ap- 
pearing in this position were indeed highly 
meaningful. Owing to the overlap in the 
component pictures in the contrasting identi- 
fication problem, however, S could never 
achieve a satisfactory performance by at- 
tending to the picture in a single position. 
Rather, S would have to learn to identify the 
unique pattern corresponding to each of the 
eight combinations of three pictures. And, 
although the individual pictures were mean- 
ingful, their combinations were not. More- 
over, the overlearned verbal labels (e.g., 


“candle,” “violin,” etc.) were of little help; 


stimuli that shared pictures would also share 
these labels. A verbal response uniquely as- 
sociated with each combination of three pic- 
tures presumably would have been helpful, 
but this S did not initially have. The situa- 
tion here is analogous to a rote learning task 
with nonsense syllables as stimuli. There, S 
has an overlearned response to each indi- 
vidual letter, but not to the whole pattern of 
three. 
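
The argument about attending to a single position can be illustrated with a short sketch. The overlapping set is built from the pictures named in the text; the nonoverlapping picture names (`p0a`, `p0b`, and so on) are hypothetical placeholders, since the pictures used in that condition are not listed here, and the function name `identifying_positions` is likewise illustrative.

```python
from itertools import product

# Overlapping set: every combination of one picture per position (8 stimuli).
overlapping = list(product(("candle", "light bulb"),
                           ("violin", "trumpet"),
                           ("screw", "nut")))

# Nonoverlapping set: no picture is shared between any two stimuli.
# These picture names are hypothetical placeholders.
nonoverlapping = [(f"p{i}a", f"p{i}b", f"p{i}c") for i in range(8)]


def identifying_positions(stimuli):
    """Return the positions whose picture alone determines the stimulus."""
    n_positions = len(stimuli[0])
    return [pos for pos in range(n_positions)
            if len({stim[pos] for stim in stimuli}) == len(stimuli)]


print(identifying_positions(overlapping))     # no single position suffices
print(identifying_positions(nonoverlapping))  # every position suffices
```

With overlapping stimuli each position carries only two distinct pictures, so at least the full combination of three is needed to tell the eight stimuli apart.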

The comparison between the identification 
and classification problems with nonover- 
lapping stimuli is unfortunately confounded 
completely with order effects; for, whereas 
the same set of nonoverlapping stimuli was 
used for both conditions, the classification 
problem always preceded the identification 
problem. Still, in view of the absence of a 
strong order effect among the other classi- 
fication problems, it seems rather surprising 
that the classification problem with nonover- 
lapping stimuli resulted in an average of 67 
errors as opposed to the average of only 0.8 
errors for the corresponding identification 
problem. 


EXPERIMENT II 


This experiment was designed to secure 
more systematic information concerning the 
kinds of rules spontaneously formulated by 
Ss in categorizing stimuli arranged accord- 
ing to the various types. In addition, infor- 
mation was sought on the following prob- 
lems: Is the difficulty of memorizing the 
classification into which the stimuli have 
been arranged related to the difficulty of 
formulating rules for their classification? 
Does the way in which the dimensions and 
their values are represented by features of 
the stimuli affect the difficulty of memoriza- 
tion or formulation of rules for a classifica- 
tion? 

These problems were investigated through 
the use of two tasks: rule formulation, in 
which Ss were first exposed to eight stimuli 
divided into two groups of four each on the 
basis of one of the various types of classi- 
fications discussed above and then asked to 
give the basis on which the two sets could be 
differentiated; and memorization, in which 
the same Ss were subsequently (1-2 weeks 
later) exposed to the same divisions of the 
stimuli into two groups but with instructions 
to study them until they had memorized the 
classification, and then were tested for their 
ability to sort the stimuli correctly into the 
two groups. 


Method 


Subjects. Data were secured from 20 college 
students from an elementary psychology class at 
the University of Bridgeport. Four other students 
were dropped because of failure to follow the in- 
structions on preliminary trials. 


Stimuli. Classifications of Types I, II, III, V, 
and VI were employed. Three different sets of 
each type, except VI, were chosen in such a way 
as to counterbalance for each type, the roles played 
by the three dimensions. (Thus, for each of the 
three Type I problems, a different dimension would 
be the relevant one.) Only one problem of Type VI 
was given since all three dimensions play the same 
role in this particular type. Altogether 3 × 4 + 1 
or 13 different sets of eight stimuli were therefore 
utilized. 

The three dimensions and two values of the 
stimuli were represented by stimulus features in 
three different ways. These are illustrated in Fig- 
ure 8 where (for each of the three ways) the 
stimuli are arranged according to a Type II classi- 
fication (with the four stimuli on the left in one 
class and the four on the right in the other class). 


Fig. 8. Three different perceptual representations 
(A, B, and C) of a Type II classification. 


We refer to the first way (A) as a "compact" 
representation since each stimulus is a single geo- 
metrical figure. The three dimensions of the stim- 
ulus are size, color, and shape. The corresponding 
values are large-small, black-white, circular- 
triangular. The second and third representations 
(B and C) are referred to as "distributed" since 
each stimulus consists of three separate geometrical 
figures (triangle, circle, square) each of which is 
used to encode one of the three dimensions. In B, 
each figure may be large or small giving us the two 
values for each dimension. In C, however, the 
values are represented in a different way for each 
figure. Thus, the triangle may be large or small, 
the circle may be black or white, and the square 
may be shaded or unshaded. (This last kind of 
representation, C, is presumably the most closely 
analogous to the stimuli used in Experiment I, 
which were all "distributed" in the present sense.) 

The choice of these three perceptual representa- 
tions was not based upon any well-defined theoret- 
ical predictions but was intended to provide some 
generality of materials and also to evaluate certain 
a priori expectations as to how the type of coding 
of the dimensions might affect performance. It was 
thought, for example, that the compact Representa- 
tion A might differ from the distributed representa- 
tions in that a single discrete response might be 
made to the single geometrical figure without hav- 
ing to respond successively to each of three spa- 
tially separated figures. Illustrative of another pos- 
sible influence of the perceptual representation 
would be the expectation that the Type VI problem 
would be easiest with Representation B, since here 
this otherwise difficult problem could be reduced to 
an odd-even rule based simply on the number of 
large figures. A final example is the possible 
formulation of a rule for Type II in terms of the 
"same" or "different" size of two of the figures 
(e.g., "when triangles and squares are same size, 
put on the right," etc. in Figure 8). It will be 
found later, however, that the differences between 
the two distributed representations were not gen- 
erally of significance, and in most of the subsequent 
discussion the results for these two representations 
(B and C) will be combined. The nature of the 
dimensions and values for the three different repre- 
sentations are specified in Table 2. 


Formulation of rules. The first task for each S 
was that of formulating rules for categorizing the 
stimuli presented to him. Instructions were de- 
signed to exclude simple enumeration in terms of 
a complete specification of all values for each of 
the eight stimuli. 

The test booklet containing an arrangement of 
the eight stimuli into one of the 13 classifications 
was presented to S. The E had a set of eight 
cards, one for each stimulus, that he was prepared 
to sort into two piles on the basis of the rule given 
to him by S. The S was instructed as follows: 

I have a duplicate set of the cards which have 
been presented to you in two groups in booklet 
form. What I want you to do is to tell me which 
figures belong in the A group, and which figures 
belong in the B group. You are not allowed to 
describe particular cards. You must give general 
descriptions of categories or classes of cards. [E 
gives an example.] This is not an intelligence 
test; we are interested in how individuals de- 
scribe sets of items. 

Do you have any questions at this time regard- 
ing what you are to do? If not, we'll start with 
the first set of items. 


TABLE 2 
NATURE OF DIMENSIONS AND VALUES FOR THE THREE DIFFERENT PERCEPTUAL REPRESENTATIONS 

Perceptual representation | Dimension 1 | Values 1, 2 | Dimension 2 | Values 1, 2 | Dimension 3 | Values 1, 2 
A | Color of figure | white, black | Size of figure | small, large | Form of figure | triangle, circle 
B | Size of triangle | small, large | Size of circle | small, large | Size of square | small, large 
C | Size of triangle | small, large | Color of circle | black, white | Shading of square | open, shaded 


The exact rules given to E by S for categorizing 
the cards were recorded. A sample protocol for 
Type I might read: "Put all the large figures on 
the left and all the small on the right." The time 
elapsing between the opening of the test booklet 
and the statement of a rule by S was recorded. A 
second E recorded S's rules and the time required 
for their formulation. 


If S gave a rule by which E could sort his set 
of cards, the rule was followed and recorded. If 
the rule was not clear or violated the instructions 
given to S, E challenged S, explaining that all the 
stimuli were not covered or that S was not follow- 
ing instructions—particularly when S only described 
the individual cards. The nature of the challenge 
and S's response to it were recorded. 


Memorization. Two weeks later each S was 
given a test of his speed and accuracy in memoriz- 
ing the assignment of stimuli according to the vari- 
ous types of classifications. In this test S was given 
a set of eight cards constituting the total set of 
stimuli. E then presented to S one of the 13 book- 
lets described above. S was instructed to study the 
booklet until he felt he could distribute his set of 
eight cards into two categories (left and right) in 
the same way as they were distributed in the book- 
let. His performance was timed and the distribu- 
tion of the cards into the two piles was recorded. 
Thereupon another set of instances was presented 
to S by E. The instructions were as follows : 


For this phase of the experiment you will be 
shown a booklet which contains eight cards, 


divided into two groups. When you feel that you 
are ready to sort your own cards tell me. I'll 
close the booklet and you pick up your cards and 
sort them into the same two groups. The order 
in which the cards appear (within each group) is 
not important. Just get the same cards into the 
same grouping. 

Don't pick up your cards until I have closed 
the booklet. Do you understand the task? Any 
questions? All right, let's begin. 


E opened one of the test booklets and, when S 
indicated he was ready, E shut the booklet and S 
sorted the eight cards into two piles from memory. 
The sorting was scored only on the basis of getting 
the stimuli correctly grouped together; whether one 
pile was placed on the right or left was not taken 
into account. (Since the order of the two groups 
was not scored, there were only 35 rather than 70 
possible classifications here.) The time elapsing 
between the presentation of the instances to S and 
the beginning of his sorting of the stimuli was 
recorded to the nearest second. 
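
The count of 35 rather than 70 possible classifications follows from treating a four-four split and its mirror image as the same grouping. A minimal sketch, with illustrative variable names, enumerates both counts directly:

```python
from itertools import combinations

cards = frozenset(range(8))

# Left/right scored: choose which four of the eight cards go on the left.
ordered_splits = [frozenset(left) for left in combinations(sorted(cards), 4)]

# Left/right ignored: a split and its mirror image count as one grouping,
# so each grouping is stored as an unordered pair of piles.
unordered_splits = {frozenset((left, cards - left)) for left in ordered_splits}

print(len(ordered_splits), len(unordered_splits))
```

Each unordered grouping corresponds to exactly two ordered splits, which is why the second count is half the first.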


Experimental design. In both the rule formula- 
tion and the memorization tasks each S received all 
13 different sets of instances in each of the three 
perceptual representations. The three representa- 
tions were contained in a latin square with approxi- 
mately one-third of the Ss receiving the compact 
Representation A first, one-third Representation B, 
and one-third Representation C. Ss for each of the 
three groups were assigned randomly. A similar 
latin square was employed for the memorization 
task except for the restriction that no S had the 
same order of representations in the two tasks. 
Within a particular kind of representation the 13 
sets of problems were given in random order. 
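
One way such a design could be laid out is sketched below. This is an assumption-laden illustration, not the assignment actually used: the text does not give the particular latin square, so a cyclic 3 × 3 square is assumed, and the function names, the seed, and the near-equal allocation of the 20 Ss to the three orders are all hypothetical.

```python
import random

REPRESENTATIONS = ["A", "B", "C"]


def cyclic_latin_square(items):
    """Build a latin square whose rows are cyclic shifts of the items."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_orders(n_subjects, square, seed=0):
    """Randomly assign subjects to rows of the square in near-equal numbers."""
    rng = random.Random(seed)
    rows = [i % len(square) for i in range(n_subjects)]
    rng.shuffle(rows)
    return [square[r] for r in rows]


square = cyclic_latin_square(REPRESENTATIONS)
orders = assign_orders(20, square)

for subject, order in enumerate(orders, start=1):
    print(subject, " -> ".join(order))
```

With 20 Ss and three orders the groups cannot be exactly equal, which is consistent with the "approximately one-third" wording in the text.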

Prior to the beginning of the experiment proper 
a series of trial tasks was employed that had sub- 
stantially different geometrical figures but involved 
essentially the same kind of classifications as those 
used in the main experiment. 


Results 


Formulation of rules. While a number of 
different kinds of rules could be formulated 
for these materials comparable to those dis- 
cussed under Experiment I, only three kinds 
constituted the bulk of the formulations actu- 
ally produced by Ss. Although it was for- 
bidden in the instructions, enumeration of 
stimuli was occasionally used—primarily in 
connection with Type VI. Examples of the 
various types of rules may be helpful. The 
following examples are taken from those 
given for the compact perceptual Represen- 
tation A: 

Single factor: “All circles on the left; triangles 
on the right.” 

Two-factor: “Small triangles and/or large cir- 
cles on the left; large triangles and small circles on 
the right.” 

Single factor with exception: “All black figures 

on the left and white figures on the right except 
the large black circle should be exchanged with 
the small white triangle." 
It will be recalled that the simplest rule for 
Type I is the single factor rule; the simplest 
for Type II is the two-factor rule; for Type 
III either a two-factor formulation (involv- 
ing three dimensions) or a "single factor 
with exception" rule is possible; the simplest 
for Type V is the “single factor with excep- 
tion” rule. 

Results are shown in Table 3. The per- 
centages of Ss giving a correct rule of each 
kind for the various types of problems are 
presented.⁷ The sets representing the same 
problem but with different dimensions util- 
ized as the basis for classification are indi- 
cated by 1, 2, and 3. General rules which 
were correct but not included as one of the 
three most common kinds are tabulated 
under “miscellaneous,” and correct state- 
ments of the classifications by complete de- 
scriptions of the four stimuli in each group 
are tabulated under the column “enumera- 
tion.” “Errors” involve an incorrect state- 
ment of the rule(s) or a failure to formulate 
a rule. 


—————— 

7 The assignment of the rules produced by Ss to 
the various categories of Table 3 was based upon 
agreement between one of the Es (CIH) and 
Albert Bregman (whose assistance as a judge is 
gratefully acknowledged). 


The principal phenomenon is the high rela- 
tive frequency with which Ss formulated ap- 
propriate simple rules (numbers enclosed in 
boxes). But the extent to which the simplest 
rule was produced varied greatly with type 
of classification and kind of perceptual rep- 
resentation. (The fact that it also varied 
somewhat from one set to another within a 
given type and perceptual representation 
may mean that the three dimensions differed 
somewhat in salience. However, this kind of 
variation is generally small compared with 
that attributable to type of classification.) 
For Type I from 75-100% of the Ss (with 
an exception to be noted at the end of the 
paragraph) formulated a single-factor rule. 
The extent of use of the simplest rule was 
actually greatest (though not significantly 
so) for Representation C, where a different 
figure and kind of variation was used to rep- 
resent each dimension. It will be noted from 
the table that some Ss utilized a less efficient 
rule for Type I in which they described the 
stimuli in terms of two dimensions rather 
than the one required. For example, instead 
of specifying “All circles on the left and all 
triangles on the right," they would say: “All 
large circles and all small circles on the left, 
and large triangles and small triangles on the 
right." This accounted for 50% of the for- 
mulations in the second of the three Type I 
problems with the compact Representation 
A. 

With the compact Representation A for 
Type II from 65-90% of Ss stated a simple, 
appropriate rule, one involving two factors. 
Thus almost as many Ss formulated a simple 
rule for Type II as did so for Type I (with 
Representation A). The corresponding per- 
centages for Representations B and C were 
substantially lower (30-55% correct) than 
for A, and were less than the corresponding 
percentages for Type I. With these repre- 
sentations Ss appear to have considerable 
difficulty in discovering a rule for Type II 
classifications. 

Results for Types III and V were closely 
similar to each other. There was a substan- 
tially smaller number of Ss using the simplest 
applicable rule than in the case of Types I 
and II. There were no clear differences 
among the three kinds of representations. 
TABLE 3 
TYPES OF CLASSIFICATION RULES USED FOR THE SIX TYPES OF PROBLEM 
(Percentage of Ss utilizing each type of rule) 

[Table body not recoverable. Columns: Single factor; Two factor; Single factor with exception; Miscellaneous; Enumeration; Errors. Rows: problem types and sets, given separately for Perceptual Representations A (Compact), B, and C.] 


The substantial number of "miscellaneous" 
classifications primarily comprises ones in 
which two of the four stimuli on each side 
were classified by a general rule and the re- 
maining two were then individually de- 
scribed. 

No S in any of the groups gave a correct 
general rule for Type VI. In violation of 
the instructions some of the Ss enumerated 
the specific stimuli which were assigned to 
each group. Although these enumerations 
were contrary to the instructions, they were 
accurate statements of the classification for 
30% of the Ss with Representation A, 10% 
with C, and 0% with Representation B. 

These results are summarized in Figure 9. 
It will be seen that there was an over-all 
progression of reduced adequacy and accu- 
racy of the formulated rules with increased 
complexity of the classification. Accuracy 
was greater for the more complex classifica- 
tions when the dimensions were perceptually 
represented in compact form (A) than when 
the information was distributed over three 
figures (as in B and C). 

The length of time required to decide on 
an appropriate formulation was also ana- 
lyzed. These times show essentially parallel 
results, with longer formulation times for 
the more complex types of classifications. 
There is one interesting inversion, however, 
in that Ss took longer to formulate a rule 
for Type II than for III or V when percep- 
tual Representations B and C were used. 
With these representations the rule for 
Type II seems extremely difficult to formu- 
late. The over-all phenomenon, however, is 
that Ss do not merely take a longer time to 
decide on an appropriate formulation but, 
even after this longer time, they still are less 
accurate in formulating the appropriate 
rules. 


8 The fact that none of the Ss in Experiment II 
discovered the highly efficient odd-even rule that 
was discovered by some of the Ss in Experiment I 
is probably a consequence of two differences be- 
tween the two experiments. First, the much larger 
number of learning trials for each problem as well 
as the consecutive presentation of problems of the 
same type provided much greater opportunity for 
the discovery of more effective rules in the first 
experiment. Second, the successive presentation of 
stimuli used only in the first experiment may have 
made it more likely that Ss would notice how many 
(e.g., whether an odd or even number of) values 
changed from one stimulus to the next. 


Memorization. Results for the second 
task, that of remembering the category to 
which each of the eight stimulus cards had 
been assigned, are presented in Tables 4 and 
5. These data show a close relationship be- 
tween the type of classification and the accu- 
racy (Table 4) and speed (Table 5) with 
which the assignment of stimuli is memor- 
ized. Practically all Ss correctly sorted the 
stimulus cards into the appropriate groups 
after a 2-5 second period of inspection in 
the case of Type I classifications. Consistent 
with the previous results on adequacy of 
formulation of rules, the greatest accuracy 
is attained with perceptual Representation C. 
Performance with Type II problems was 
substantially poorer than for Type I and the 
difference was most marked for perceptual 
Representations B and C.⁹ Measures in 
terms of accuracy and those in terms of time 
for memorization are closely parallel. 

Problems of Type III and V did not differ 
among themselves but both were clearly 
more difficult than Type II. Again the dif- 
ferences between Type II and Types III or 
V were more marked with Representations 
B and C. 


9 None of the available statistical procedures is 
completely appropriate for frequency data (i.e. fre- 
quencies of correct formulation of rules or fre- 
quencies of correct memorization) when these 
frequencies are obtained from the same Ss under 
different conditions. The analysis mentioned below, 
however, revealed no transfer from one perceptual 
representation to another or from one type of prob- 
lem to another. Under these conditions the type of 
chi square analog of analysis of variance developed 
by Lancaster and described by Sutcliffe (1957) pro- 
vides conservative estimates of significance since 
the remaining factor, individual differences, would 
certainly result in a still smaller error term. The 
difference between perceptual representations, ac- 
cording to this analysis, is significant at the .001 
level. (This difference is primarily attributable to 
the difference between the compact representation, 
A, and the distributed representations, B and C, 
taken together.) The difference attributable to type 
of classification is significant at the .001 level also. 
The internal consistency of the data as well as their 
similarity to those of Experiments I and III attest 
further to their reliability. 
Fig. 9. Percentage of Ss formulating correct 
rules for each type of classification. (Shading indi- 
cates percentage formulating most efficient rule.) 


Still more difficult in terms of accuracy 
and time for memorization is Type VI. The 
distributed perceptual representations (B 
and C), composed of three figures, once 
more resulted in greater difficulty than the 
representation in compact form (A). The 
difference in accuracy score between Repre- 
sentations B and C was in the direction pre- 
dicted from the greater availability of the 
odd-even rule (discussed above) for Repre- 
sentation B, but the difference between one 
and four (out of a possible 20 Ss) is not 
statistically significant and we did not secure 
confirming evidence for its utilization in the 
rule formulation task. 

The results for accuracy and speed of 
memorization are shown graphically in Fig- 
ure 10 where the data for the three kinds of 
perceptual representations are combined. 
The same relationship between accuracy and 
inspection times that was obtained in the 
case of the rule formulation task also was 
obtained here; i.e., accuracy was lower for 
the difficult problems even after additional 
time had been spent in assimilating the infor- 
mation. 


TABLE 4 
ACCURACY IN MEMORIZATION TASK: EXPERIMENT II 
(Percentage of Ss correctly assigning stimuli from memory) 

Problem         Perceptual Representation 
Type   Set      A            B                  C 
                (Compact)    (Distributed:      (Distributed: 
                             same values)       different values) 
I      1        100          100                100 
       2        100           80                100 
       3         90           85                100 
II     1         85           60                 70 
       2         70           60                 70 
       3         90           15                 80 
III    1         65           50                 35 
       2         60           60                 50 
       3         65           30                 50 
V      1         60           35                 45 
       2         55           60                 50 
       3         85           30                 35 
VI               55           20                  5 


TABLE 5 
TIME REQUIRED TO MEMORIZE STIMULI: EXPERIMENT II 
(Median number of seconds elapsing between presentation of stimuli and S's sorting of stimulus cards) 

Problem         Perceptual Representation 
Type   Set      A            B                  C 
                (Compact)    (Distributed:      (Distributed: 
                             same values)       different values) 
I      1         2            4                  4 
       2         4            5                  4 
       3         4            4½                 3 
II     1         7           21                 23 
       2         9           23                 27 
       3         1½          17                 20 
III    1        11½          21½                21 
       2        12           24½                31 
       3        11           23½                21½ 
V      1        12           27½                22½ 
       2        10           19                 26 
       3         9           27½                31 
VI              18½          51                 42 
Fig. 10. Accuracy (percentage of Ss correctly 
assigning stimuli to appropriate category) and 
speed (time required to memorize assignment) for 
various types of classifications. 


The difference between the perceptual 
representation in compact form (A) and 
that distributed over three figures (B and 
C) is summarized graphically in Figure 11. 
The differentiation between types of classi- 
fications will be seen to be more marked in 
the case of the distributed Representations B 
and C than in the case of the compact Rep- 
resentation A. 
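
The contrast between compact and distributed representations can be made concrete by averaging the Table 4 percentages within each type. The sketch below is a hedged illustration: the values are transcribed from Table 4 as printed (individual cells may be affected by print quality), the pooling of B and C by a simple mean over sets is our assumption about how Figure 11 was formed, and the names `TABLE_4`, `compact`, `distributed`, and `spread` are illustrative.

```python
# Percentage correct for each set, by type and representation,
# transcribed from Table 4 (Experiment II).
TABLE_4 = {
    "I": {"A": [100, 100, 90], "B": [100, 80, 85], "C": [100, 100, 100]},
    "II": {"A": [85, 70, 90], "B": [60, 60, 15], "C": [70, 70, 80]},
    "III": {"A": [65, 60, 65], "B": [50, 60, 30], "C": [35, 50, 50]},
    "V": {"A": [60, 55, 85], "B": [35, 60, 30], "C": [45, 50, 35]},
    "VI": {"A": [55], "B": [20], "C": [5]},
}


def mean(values):
    return sum(values) / len(values)


# Mean accuracy per type: compact (A) versus distributed (B and C pooled).
compact = {t: mean(reps["A"]) for t, reps in TABLE_4.items()}
distributed = {t: mean(reps["B"] + reps["C"]) for t, reps in TABLE_4.items()}


def spread(by_type):
    """Difference between the easiest and hardest type, in percentage points."""
    return max(by_type.values()) - min(by_type.values())


for t in TABLE_4:
    print(f"Type {t}: compact {compact[t]:.1f}, distributed {distributed[t]:.1f}")
print(f"spread: compact {spread(compact):.1f}, distributed {spread(distributed):.1f}")
```

On these figures the range from easiest to hardest type is roughly twice as large for the distributed representations as for the compact one, in line with the more marked differentiation described in the text.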

Fig. 11. Percentage of Ss assigning stimuli to 
correct category for various types of classifications 
with compact (A) and distributed (B and C) per- 
ceptual representations. 


Analysis was made of performance when 
the same task was carried out during the 
first, second, and third portions of the test- 
ing cycle. No significant improvement was 
found attributable to prior experience with 
the problems presented in other perceptual 
representations. Similarly, no improvement 
on successive problems within a particular 
kind of representation was found. 


Relationship between difficulty of formu- 
lation of rules and of memorization. The 
data presented in Table 3 were correlated 
with the corresponding results in Table 4 to 
determine the relationship between the ex- 
tent to which the simplest rule was formu- 
lated in the first task and the accuracy of 
sorting from memory in the second task. A 
correlation was computed for 39 entries; i.e., 
for the 13 pairs of values for each of the 
three perceptual representations. The corre- 
lation was .90 and the regression was linear. 
This appears to indicate that a common factor 
is responsible for the increase in difficulty of 
rule formulation and of memorization. 


EXPERIMENT III 


It might be contended that the differentia- 
tion in memory scores for the various classi- 
fications of stimuli in Experiment II was at- 
tributable to the prior experience of formu- 
lating rules for classifying these stimuli. As 
a check on this possibility the memorization 
task alone was repeated with a new group of 
Ss having no prior experience in formulating 
rules for classifications. Otherwise, essentially 
the same procedures were used. 


Method 


The procedures for the memorization task were 
identical to those described for Experiment II ex- 
cept that all six types of classifications (including 
IV) were employed. Two sets of Types I, II, III, 
IV, and V and one of Type VI were memorized 
by each S. The three perceptual representations 
illustrated in Figure 8 were again used. 

The Ss were 26 students from the elementary 
psychology course at the University of Bridgeport. 


Results 


Data concerning accuracy in memorization 
of the six types of classifications are given 
in Table 6. Corresponding results on time 
required to memorize the classifications are 
presented in Table 7. The increase in diffi- 
culty with increased complexity of type of 
classification is again evident. Type I was 
the easiest to memorize, Type II was next, 
and Type VI was the most difficult. As be- 
TABLE 6 


ACCURACY IN SORTING FROM MEMORY: 
EXPERIMENT III 
(Percentage of Ss correctly assigning stimuli from 
memory) 


Problem             Perceptual Representation 
                A            B                  C 
Type   Set  (Compact)   (Distributed:      (Distributed: 
                         same values)    different values) 

I       1       96           89                100 
        2       89           62                100 
II      1       81           62                 54 
        2       96           54                 42 
III     1       62           50                 46 
        2       58           54                 23 
IV      1       35           50                 42 
        2       65           33                 50 
V       1       50           27                 38 
        2       73           42                 38 
VI              62           23                 31 


fore, the greatest accuracy on Type I was 
achieved with the distributed Representa- 
tion C. The decrease in accuracy with in- 
creased complexity was quite gradual for the 
compact Representation A but was more 
pronounced with the two distributed Representations 
B and C.¹⁰ Clearly, prior practice 
is not the explanation for the relationship 
between complexity of classification and ac- 
curacy in memorization since the results are 
remarkably similar to those in Experiment 
II. The only appreciable effect of prior ex- 
perience with the task of formulating rules 
appears to be a reduction in the time ex- 
pended in memorizing Type I classifications. 


10 The differences between perceptual representa- 
tions and the difference between types of classifica- 
tions are both significant at the .001 level according 
to the chi square analysis (the use of which was 
described and justified for these kinds of data in 
Footnote 9). 
DISCUSSION OF EMPIRICAL RESULTS 


We shall now try to summarize and tie 
together the results of the three experiments, 
which have just been described in detail, and 
to relate these results to those obtained in 
earlier investigations of the learning of clas- 
sifications. A central feature of the three 
present experiments is that they attempt a 
systematic exploration of the effect of the 
structure of a classification upon the diffi- 
culty of learning or remembering that classi- 
fication. The classifications were always con- 
structed by dividing eight stimuli into two 
groups of four. Moreover, each stimulus 
always took on one of two highly discrimin- 
able values on each of three dimensions. The 
structure of each classification was defined, 
therefore, in terms of the way in which the 
membership (in one of the two classes) of 
each stimulus could be specified in terms of 
the dimensions and values of the stimuli. In 


TABLE 7 


TIME REQUIRED TO MEMORIZE STIMULI: 
EXPERIMENT III 
(Median number of seconds elapsing between 
presentation of stimuli and S's sorting of 
stimulus cards) 


Problem             Perceptual Representation 
                A            B                  C 
Type   Set  (Compact)   (Distributed:      (Distributed: 
                         same values)    different values) 

I       1       4½          11½                 2½ 
        2       5           14             [illegible] 
II      1       9           20                 23 
        2       9           20                 23½ 
III     1      12½          24                 20 
        2      14           25                 21 
IV      1      16           24                 20½ 
        2      13           22                 21 
V       1      16           22½                24 
        2      13           24                 23½ 
VI             19           23½                32 
addition to the structure of the classification, 
however, certain other features were also 
varied. These primarily concerned (a) 
whether the task was designed to measure 
learning during successive presentation of 
stimuli (constructed from pictures of con- 
crete objects), or whether it was designed 
to measure retention after simultaneous pre- 
sentation of stimuli (constructed from ab- 
stract geometrical figures); (b) whether the 
three dimensions were confined to a single 
compact figure, or spread out over three 
spatially distributed figures; and (c) 
whether S was confronted with a problem 
for the first time, or after mastering several 
other problems of the same kind. We turn, 
now, to a consideration of some of the major 
results of these variations and their relation 
to the results of earlier investigations. 


Initial Difficulties of the Six Types 


Figure 12 summarizes the results of the 
three experiments on how the structure of a 
classification affects its initial difficulty (i.e., 
when the learning or memorization of the 
classification was not immediately preceded 
by the learning or memorization of another 
classification of the same structural type). 
As can be seen, the ranking of the six struc- 
tural types is the same in every case: 
namely, I < II < (III, IV, V) < VI 
(with III, IV, and V about equal in diffi- 
culty). This ranking apparently holds up, 
then, whether a memory or learning task is 
used; whether the stimuli are abstract or 
concrete, compact or distributed; or whether 
difficulty is measured by the time Ss take to 
study a classification or by the number of 
errors they subsequently make during recall 
of that classification. Thus the abstract 
structure of a classification (as represented 
by the six basic types) seems to be an im- 
portant determiner of its difficulty. 

Many kinds of problems used in the study 
of concept learning resemble those originally 
investigated by Hull (1920) in that mastery 
of one of these problems can be achieved 
simply by discovering which of the several 
variable properties of the stimuli is the one 
that determines which response will be correct. 
All such problems correspond most 


closely to what we have termed “Type I clas- 
sifications." This is true even when the rele- 
vant information is carried in a completely 
redundant fashion by more than one dimen- 
sion of the stimuli (as in the experiment by 
Bourne & Haygood, 1959), since simul- 
taneous attention to two or more dimensions 
is never required. However, other studies of 
Fig. 12. Comparison of the difficulties of the 
six types of classifications for three different experiments 
and two kinds of measures of difficulty. 
(The data from Experiment I are restricted to the 
numbers of errors made during the learning of only 
the first problem of each type in order to make the 
results from this experiment comparable with those 
from Experiments II and III. The data from Experiments 
II and III are averaged over the three 
perceptual representations studied in those experiments 
(viz., A, B, C). To facilitate comparison 
among the five sets of difficulty scores, they were 
all brought into the same range by linearly transforming 
each set of scores so as to have the same 
mean and variance as the other four sets of scores. 
Hence the size of the units and the location of the 
zero points are arbitrary for each set. Finally, the 
results for Experiment I are placed next to the 
results for Experiment III so that the position of 
Type IV, not included in Experiment II, can more 
readily be compared.) 
concept learning, particularly those concerned 
with conjunctive and disjunctive concepts 
(e.g., Bruner, Goodnow, & Austin, 
1956; Hovland & Weiss, 1953), have come 
closer to our more complicated types of clas- 
sifications. 

Of particular interest in the present con- 
nection are two other studies, one by S. L. 
Smith (1954) and one by Lise Wallach 
(in press). The three types of classifications 
investigated by Wallach corresponded to our 
Types I, II, and VI. Wallach also used two 
kinds of stimuli. For the kind that most 
closely approximated those used in the 
present experiments she obtained the same 
ranking, i.e., (in our notation) I < II < 
VI. The results obtained by Wallach for her 
second—and rather different—kind of stim- 
uli will be considered later. Smith’s experi- 
ment differed from ours (and Wallach’s) in 
that the stimuli varied along four (instead 
of three) dimensions and, consequently, in 
that there were 16 (instead of 8) stimuli. 
Nevertheless, if one of the irrelevant dimen- 
sions is disregarded, three of Smith’s condi- 
tions correspond to our Types I, II, and VI. 
With this interpretation, his results were 
also consistent with ours; for again, I < II 
< VI. (However, both Smith and Wallach 
stated that the differences found by them 
between the two more difficult types were 
not statistically significant.) Another condi- 
tion included by Smith differed from those 
just mentioned in that none of the four 
dimensions was irrelevant. This condition 
was the four-dimensional analog of our 
three-dimensional Type VI condition and, in 
line with our results, was still more difficult 
than the three conditions just considered. 

In contrast to all of these classification 
problems (referred to by Smith as “struc- 
tured" problems), Smith (like Metzger, 
1958) also included a classification problem 
characterized by him as "random." In this 
condition the 8 stimuli to be associated with 
one of the two responses were selected from 
the 16 stimuli at random rather than on the 
basis of the dimensions and values of the 
stimuli. However, our analysis of types of 
classifications makes clear that every classi- 
fication has some structure or other. For ex- 
ample, if our eight stimuli were divided into 


two equal classes at random, the probability 
is 0.8 that the resulting random classification 
would really be a Type III, IV, or V classi- 
fication. For, of the 70 distinct classifica- 
tions, just 56 happen to be of one of these 
three types. The classification that Smith 
labeled random was therefore probably one 
of the many four-dimensional types that are 
roughly analogous to our three-dimensional 
Types III, IV, and V. (A table of these 
four-dimensional types has been compiled by 
Moore and is presented by Higonnet & 
Grea, 1958.) Thus the word “random” can 
only be interpreted as referring to the way 
in which the classification was generated; it 
cannot properly be regarded as denoting a 
property of the classification itself. Indeed, 
for every type of classification (including, 
therefore, any generated at random) the Ss 
in the present experiments were often able 
to discover some kind of simplifying rule or 
regularity in the classification. Surprisingly, 
Smith's Ss made even more errors on his 
random classifications than on his four-dimensional 
analog of our Type VI classification. 
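
The proportion of 56 out of 70 quoted above can be checked by enumeration. The following Python sketch is an illustration only and is not part of the original analysis. It assumes (our reading of the text) that a Type I classification depends on one dimension, a Type II classification on exactly two, and that Type VI is the classification in which every change of a single value switches the class; everything else is lumped together as Types III, IV, and V. The function names are hypothetical.

```python
from itertools import combinations, product

# The eight stimuli: every combination of two values on three dimensions.
STIMULI = list(product((0, 1), repeat=3))


def relevant_dimensions(class_a):
    """Return the dimensions whose value can change a stimulus's class."""
    relevant = set()
    for s in STIMULI:
        for d in range(3):
            flipped = tuple(v ^ 1 if i == d else v for i, v in enumerate(s))
            if (s in class_a) != (flipped in class_a):
                relevant.add(d)
    return relevant


def is_parity(class_a):
    """True when all four members have the same parity of values,
    so that every single-value change crosses the class boundary."""
    return len({sum(s) % 2 for s in class_a}) == 1


def broad_type(class_a):
    """Assumed criteria: I = one relevant dimension, II = two,
    VI = parity, otherwise one of Types III, IV, or V."""
    n = len(relevant_dimensions(class_a))
    if n == 1:
        return "I"
    if n == 2:
        return "II"
    return "VI" if is_parity(class_a) else "III-V"


# Count the 70 ways of choosing which four stimuli receive the first response.
counts = {}
for members in combinations(STIMULI, 4):
    key = broad_type(frozenset(members))
    counts[key] = counts.get(key, 0) + 1

print(counts)
print(counts["III-V"] / sum(counts.values()))
```

Under these assumptions the enumeration yields 6, 6, and 2 classifications of Types I, II, and VI, leaving 56 of the 70 for Types III, IV, and V.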


In general, though, the present results 
agree with those of Smith (1954), Wallach 
(in press), French (1953), and others in 
showing that a classification is easier to learn 
and remember when it is related in a simple 
way to the dimensions and values of the stim- 
uli. For stimuli varying along a given num- 
ber of dimensions, the easiest classification is 
the one in which the value on a single dimen- 
sion completely determines which of the two 
classificatory responses is appropriate. The 
present results, as well as those of Smith 
and of Wallach, show that the initial difficulty 
of a classification monotonically increases 
beyond that of this easiest classification 
as the values on more and more dimensions 
must be taken into account.¹¹ A similar 


11 In the present experiments there is a complete 
confounding of the number of relevant and the 
number of irrelevant dimensions such that, whenever 
one increased, the other necessarily decreased. 
This is a consequence of the facts that there were 
always the same number of variable dimensions 
(viz., three) and that the dimensions were nonredundant 
in the sense that every possible combination 
of values on these dimensions occurred with 
the same probability. It is only for these conditions 
result was apparently also found in another 
unpublished experiment by Walker (referred 
to by Bourne & Haygood, 1959). On the 
other hand, the present results go beyond 
those that have previously been reported in 
that they are based on the first complete 
sampling of the possible types of classifica- 
tions. The three new types to which atten- 
tion has been called by this sampling (viz., 
III, IV, and V) complete the over-all pattern; 
they are apparently intermediate between 
Types II and VI both in terms of the 
number of dimensions that must be attended 
to simultaneously and in terms of the diffi- 
culty of the classification. 


Effect of the Physical Representation of the 
Dimensions 


Although the diverse kinds of stimuli used 
in the present experiments all led to the 
same ranking of the six types of classifica- 
tions with respect to difficulty, they did have 
an appreciable effect upon the absolute level 
of difficulty of specific types. The most 
prominent difference appears to be between 
the distributed stimuli (Representations B 
and C in Experiments II and III), in which 
the three dimensions were presented as vari- 
ations in three spatially separated figures, 
and the compact stimuli (Representation A), 
in which the three dimensions were pre- 


that we propose the above generalization: namely, 
that the difficulty of a classification increases with 
the number of relevant dimensions (and, therefore, 
decreases with the number of irrelevant dimensions). 
If further dimensions were added in order 
to achieve independence of the number of relevant 
and irrelevant dimensions, the expected result 
would be quite different. In particular, if the num- 
ber of nonredundant and relevant dimensions were 
held constant while further dimensions were added, 
the difficulty of a classification would presumably 
change in either of two possible ways depending 
upon the relevance and redundancy of these new 
dimensions: if the added dimensions were irrelevant 
to the classification, the difficulty should increase 
(Archer, Bourne, & Brown, 1955; Bourne & Hay- 
good, 1959); but, if the added dimensions were 
completely redundant with the original dimensions 
and therefore relevant, the difficulty should decrease 
for simple classifications (Bourne & Haygood, 
1959) and, apparently, increase for more complex 
classifications (Bricker, 1955). 


sented as three kinds of variations in the 
same figure. The results show that the more 
difficult types of classifications (II-VI) be- 
come still more difficult when the stimuli are 
changed from the compact to the distributed 
form. Types II and VI seem to be the most 
strongly affected by such a change. Type I, 
on the other hand, is almost uninfluenced by 
this change. Indeed, what little effect there 
may be on this easiest type of classification 
appears to be in the opposite direction. 
(Note, particularly, the differences between 
Representations A and C for this type in 
Tables 6 and 7.) 

The experiment by Wallach (in press) 
seems to have some bearing on these results. 
Of the two kinds of stimuli used in her experiment, 
one closely resembled our Representation 
C (in Experiments II and III). 
The only difference was that the two alterna- 
tive figures that could occur in each of the 
three spatial positions of a stimulus were sim- 
ple nonsense figures (composed of curved 
lines) rather than conventional geometrical 
figures. As we have already observed, Wal- 
lach's results for these stimuli were consistent 
with ours. The second kind of stimuli used by 
Wallach more closely resembled our compact 
Representation A in that the curved lines 
constituting the values on each dimension 
were all combined into a single, more com- 
plex nonsense figure. However, this com- 
pact representation was quite different from 
ours in that the values on each of the three 
dimensions merged into one another in such 
a way as to lose their identities as distinct, 
perceptually isolated properties. Conse- 
quently, as Wallach remarked, these stimuli 
tended to be reacted to as unique wholes 
rather than analyzed into separate dimen- 
sions and values. For these stimuli she found 
no significant differences in the difficulties of 
the three types of classifications investigated. 
Moreover all three were significantly more 
difficult than the easiest and significantly less 
difficult than the hardest classification with 
distributed stimuli. Smith (1954) also 
found that, when the relevant dimensions of 
the stimuli were made more obscure, the dif- 
ferences in the difficulties of different classi- 
fications tended to disappear. 
In any case the present results together 
with those of Wallach suggest the following 
generalization: As the representation of the 
dimensions in the stimuli is made more com- 
pact, the differences in the difficulties of the 
various types of classifications are de- 
creased. If the dimensions remain perceptu- 
ally distinct, this compression in the varia- 
tion in difficulty is primarily attributable to 
a disproportionate decrease in difficulty of 
the initially more difficult types. But, if the 
dimensions merge and become perceptually 
indistinct, the initially easier types become 
more difficult and, in the extreme case, all 
types of classifications approach the same 
intermediate level of difficulty. 


Transfer of Classification Learning 


In all three of the present experiments (as 
well as in that of Smith) each S went 
through several different problems in succes- 
sion. However, the conditions of the various 
experiments differed in two respects: prob- 
lems of the same basic type were either pre- 
sented consecutively in clearly demarcated 
blocks or else intermixed at random with 
other problems of different types; training 
on each problem was either carried to a high 
level of mastery on each problem before 
proceeding to the next or else terminated 
after one exposure to the set of stimuli. 
These variations in conditions apparently in- 
fluenced the extent to which the learning of 
one problem transferred to the next. When 
a high level of mastery was not required and 
when similar problems were scattered 
throughout the entire series (as in Experi- 
ments II and III), there was no systematic 
improvement in performance over that 
series. But, when a high level of mastery 
was required (as in Experiment I and the 
earlier experiment by Smith), fewer errors 
were made on the problems toward the end 
of the series. (However, this trend was 
statistically significant only in the experi- 
ment by Smith.) Finally, when both a high 
level of mastery was required and, also, 
problems of the same type were presented 
consecutively, the positive transfer from one 
problem to another of the same type was 
quite pronounced (as shown by the error 


curve in Figure 7 of Experiment I). These 
results appear to be consistent with the con- 
clusions of Morrisett and Hovland (1959). 
They presented evidence that, in order to 
realize positive transfer, it is not sufficient 
simply to have a wide variety of problems; it 
is also necessary to achieve a high level of 
mastery on each problem. The greater 
amount of training insured by the learning 
tasks (as opposed to the memorization 
tasks) as well as the grouping together of 
problems of the same type presumably re- 
sulted in a greater mastery of the problems 
and problem types. 

Previously reported experiments on clas- 
sification learning have not usually been spe- 
cifically concerned with transfer from one 
classification problem to another of the same 
type. Experiment I therefore provides new 
information about changes that occur in the 
ranking of the difficulties of the different 
types of classifications when several prob- 
lems of the same type are learned in succes- 
sion. The most striking finding, here, is that 
Type VI (which is initially the most diffi- 
cult) also accumulates the greatest positive 
transfer with continued practice. As a con- 
sequence, after two or three problems in 
which the stimuli are changed but the type of 
classification remains the same, Type VI be- 
comes less difficult than some of the other 
types (evidently III, IV, and V). This indicates 
that an exclusive focus on the initial 
level of difficulty of each type of classifica- 
tion can be misleading. 


THEORETICAL DISCUSSION 


We now examine some of the principles 
that have been adduced to account for phe- 
nomena of rote learning and concept forma- 
tion. The main objective will be to evaluate 
the ability of these principles to account for 
the results of the present experiments. Primary 
among these results is the finding that, 
when they are initially encountered, the six 
types of classifications consistently differ in 
difficulty according to the ranking I < II < 
(III, IV, V) < VI. There are also certain 
secondary results, though. In particular, 
when several problems of the same type are 
learned in succession, Type VI realizes by 
far the greatest within-type positive transfer; 
and, when the relevant dimensions are 
distributed over spatially separated figures 
(rather than combined as different aspects 
of a single figure), the difficult types of classifications 
become still more difficult. 

Some of the principles that have previously 
been proposed pertain more to the 
nature of the individual stimuli than to the 
structure of the classification and, so, do not 
by themselves yield definite predictions for 
the primary result of the present study (i.e., 
the ranking of the six types). It is for this 
reason that we omit discussion, for example, 
of Heidbreder's (1946, 1947) principle of 
degree of "thing-character" of the stimuli. 
In connection with this particular principle, 
moreover, Baum (1954) has indicated that 
some of the results of Heidbreder's widely-known 
studies may be derivable from a principle 
of stimulus generalization, to which we 
now turn. 


Stimulus Generalization 


A number of investigators have been attracted 
by the possibility that the apparently 
complex phenomena of concept learning 
might be largely understood in terms of the 
more elementary phenomena of rote learning. 
The importance of the phenomenon of 
stimulus generalization has been repeatedly 
emphasized in this regard (Baum, 1954; 
Buss, 1950; French, 1953; Gibson, 1940; 
Newman, 1956; Oseas & Underwood, 
1952).¹² Baum's statement of this viewpoint 
is perhaps the most incisive. From the principle 
of stimulus generalization, as applied to 
verbal learning by Gibson (1940), she deduces 
that the difficulty of learning a classification 
should increase as the stimuli that 
are assigned to different classes are chosen 
to be less discriminable or as the stimuli that 
are assigned to the same class are chosen to 
be more discriminable. Certainly this prin- 


12 The term "generalization" has been used to 
refer to various things, including an inductive inference 
as to what characterizes the class of stimuli 
to which a certain response can be appropriately 
extended. However, the term is used here only in 
the narrow sense of a primitive or automatic tendency 
to confuse similar stimuli during learning. 
ciple seems to account for the obvious fact 
that it is easier to learn to classify four dif- 
ferent horses as A’s and four different dogs 
as B’s than to classify two of the dogs and 
two of the horses as A’s and the remaining 
two horses and two dogs as B’s. For horses 
are more discriminable from dogs than are 
horses from horses or dogs from dogs. 
Moreover, if we assume that discriminability 
of two stimuli is greater when they have 
fewer properties in common, we can make 
the prediction (confirmed by our empirical 
results) that a Type VI classification will be 
more difficult to learn than a Type I. For, 
whereas the average number of properties 
shared by two stimuli that are classified to- 
gether is 1.7 for Type I classifications, it is 
only 1.0 for Type VI (see the cubical rep- 
resentations in Figure 3). However, we can- 
not make a quantitative prediction of the 
difficulty of each of the six types unless we 
have some additional information about how 
the amount of generalization between two 
stimuli depends upon the number of prop- 
erties that they have in common. 
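
The two averages quoted above (1.7 and 1.0) can be reproduced directly from the cubical representation. The sketch below is illustrative only; it assumes that a Type I classification splits the stimuli on the first dimension and that a Type VI classification splits them by the parity of their three values. The helper names are hypothetical.

```python
from itertools import combinations, product

# The eight stimuli as corners of a cube: two values on each of three dimensions.
STIMULI = list(product((0, 1), repeat=3))


def shared(a, b):
    """Number of dimensions on which two stimuli have the same value."""
    return sum(x == y for x, y in zip(a, b))


def mean_shared_within(class_a):
    """Average number of shared values over all pairs classified together."""
    class_b = [s for s in STIMULI if s not in class_a]
    pairs = list(combinations(sorted(class_a), 2)) + list(combinations(class_b, 2))
    return sum(shared(a, b) for a, b in pairs) / len(pairs)


# Assumed representatives of the two extreme types.
type_i = {s for s in STIMULI if s[0] == 0}          # one dimension decides
type_vi = {s for s in STIMULI if sum(s) % 2 == 0}   # parity of all three

print(round(mean_shared_within(type_i), 2))   # 1.67
print(mean_shared_within(type_vi))            # 1.0
```

For Type I the exact value is 20/12, which rounds to the 1.7 given in the text.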


The strong interpretation of the principle 
of stimulus generalization for classification 
learning. Central to the principle of stim- 
ulus generalization is the notion that the 
over-all difficulty of a task is compounded 
primarily from the confusions of individual 
pairs of stimuli. From this standpoint, then, 
the total number of errors made during the 
learning of a particular classification of 
stimuli should be predictable from a knowledge 
merely of the pair-wise confusions between 
these stimuli. But just this knowledge 
can readily be obtained from experiments 
on identification learning: i.e., from experi- 
ments in which a different response is asso- 
ciated with each of the stimuli (Shepard, 
1958b). Thus, if we interpret the principle 
of stimulus generalization to mean that the 
total number of times that two stimuli will 
be confused is the same for classification 
learning as for identification learning, we 
can predict the total number of errors for 
any particular classification as follows: First, 
several Ss are trained to criterion on an 
identification task with the same set of N 
stimuli to be used in the classification task. 
Then, the number of times each of the stimuli 
leads to the response assigned to each of 
the other stimuli is tabulated in the appropri- 
ate cell of an N X N matrix. A number in 
any off-diagonal cell of this matrix can be 
interpreted as the number of times the cor- 
responding pair of stimuli were confused 
prior to reaching criterion. Now, during 
classification learning, the confusion of two 
stimuli that are assigned to the same re- 
sponse will not result in an overt error. 
Thus not all errors of identification will 
lead to errors of classification. In fact, in 
order to predict the total number of errors 
to be expected for any particular classifica- 
tion, one simply strikes out the numbers in 
each cell of the matrix that correspond to a 
pair of stimuli assigned to the same re- 
sponse, and sums the remaining off-diagonal 
entries. The predicted number of errors will 
in general be different for different classi- 
fications (even though the same matrix is 
used) because different entries are included 
in each sum. 
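
The strike-out procedure just described can be sketched in a few lines of Python. The confusion matrix below is a made-up four-stimulus example used only to show the arithmetic; it is not data from any of the experiments, and the function name is hypothetical.

```python
def predicted_errors(confusions, assignment):
    """Sum the off-diagonal confusion counts for pairs of stimuli that are
    assigned to different responses; same-response pairs are struck out."""
    n = len(confusions)
    total = 0
    for i in range(n):
        for j in range(n):
            if i != j and assignment[i] != assignment[j]:
                total += confusions[i][j]
    return total


# Hypothetical identification-learning matrix: entry [i][j] is the number of
# times stimulus i evoked the response assigned to stimulus j.
confusions = [
    [0, 3, 1, 0],
    [2, 0, 0, 1],
    [1, 1, 0, 4],
    [0, 2, 3, 0],
]

# Two different classifications of the same four stimuli.
print(predicted_errors(confusions, "AABB"))  # 6
print(predicted_errors(confusions, "ABAB"))  # 13
```

The same matrix gives different predictions for the two classifications because different cells survive the strike-out.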

The basis for this method of prediction 
will be referred to as “the strong interpreta- 
tion of the principle of stimulus generaliza- 
tion in classification learning" in order to 
distinguish it from less quantitative formu- 
lations, such as that proposed by Baum. 
This “strong interpretation" is essentially an 
extension to classification learning of the 
mathematical formulation of generalization 
already proposed for identification learning 
by Shepard (1957). For, according to that 
earlier formulation, the confusions between 
stimuli resulting from stimulus generaliza- 
tion do not depend upon how the responses 
have been assigned to the stimuli.¹³ There 
is, however, one other implication of that 


13 In the case of classification learning the following 
assumption is probably better: Given that 
two stimuli are assigned to different responses, the 
number of confusions between those stimuli is independent 
of other aspects of the assignment. Owing 
to the absence of differential reinforcement, the 
number of confusions between two stimuli might be 
much greater if they were assigned to the same 
response than if they were assigned to different 
responses. The possibility of a much greater num- 
ber of confusions between stimuli assigned to the 
same response raises no problem for the present 
analysis, however, since such confusions are not 
observable anyway. 


formulation that should be carried over to 
the present situation: In order to minimize 
the contribution of confusions between the 
responses to the matrix obtained during 
identification learning, the identification re- 
sponses should be chosen to be as distinctive 
as possible and should be paired with the 
stimuli according to a different assignment 
for each S. 

The test of the strong interpretation be- 
comes particularly simple in cases like the 
present one, for which differences along each 
of the three dimensions of the stimuli are 
chosen to be about equally discriminable. 
We then simply determine the average num- 
ber of confusions made during identification 
learning for pairs of stimuli with two, one, 
or zero values in common along the three 
variable dimensions. These three numbers 
(designated n₂, n₁, and n₀) constitute a kind 
of gradient of generalization. (However 


this gradient differs from the usual kind in 


that the independent variable is number of 
common properties rather than separation 
along a single physical continuum.) Now for 
any one of the six types of classifications, 
exactly 16 of the 28 possible pairs of stimuli 
will satisfy the condition that one stimulus 
of the pair is assigned to one response and 
the other stimulus to the other response. 
From an inspection of the appropriate cube 
in Figure 3 we can determine, for each type 
of classification, how many of these 16 
between-class pairs have two, one, or zero 
properties in common. (For example, these 
three numbers are 4, 8, 4 for a Type I and 
12, 0, 4 for a Type VI classification.) We 
can then calculate the total number of con- 
fusions expected for each type of classifica- 
tion by summing the expected number of 
confusions for each between-class pair. The 
appropriate formulas are given, in terms of 
the gradient (n₂, n₁, n₀) obtained from identification 
learning, in Table 8. 
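
The counting of between-class pairs can be illustrated with the following Python sketch. The representative classification chosen for each type is an assumption on our part (Figure 3 is not reproduced here); the representatives were picked so that Types I and VI give the counts stated in the text, and the names used are hypothetical.

```python
from itertools import product

# The eight stimuli as corners of a cube: two values on each of three dimensions.
STIMULI = list(product((0, 1), repeat=3))

# Assumed representative of each type: the four stimuli assigned to one response.
REPRESENTATIVES = {
    "I":   {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)},
    "II":  {(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)},
    "III": {(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 1)},
    "IV":  {(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)},
    "V":   {(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 1, 1)},
    "VI":  {(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)},
}


def between_class_profile(class_a):
    """Count the 16 between-class pairs having two, one, or zero values in common."""
    counts = [0, 0, 0]  # index = number of shared values
    for a in class_a:
        for b in STIMULI:
            if b not in class_a:
                counts[sum(x == y for x, y in zip(a, b))] += 1
    return counts[2], counts[1], counts[0]


profiles = {t: between_class_profile(c) for t, c in REPRESENTATIVES.items()}
for type_name, profile in profiles.items():
    print(type_name, profile)
```

Each triple gives the coefficients of n₂, n₁, and n₀ in the predicted number of errors for that type.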

Method of testing the strong interpreta- 
tion. In order to gauge the extent to which 
this interpretation of the principle of stim- 
ulus generalization can account for our re- 
sults, we first estimated the average values 
of the three numbers n₂, n₁, and n₀ from the 
identification problems in Experiment I and, 
then, determined whether the substitution 


TABLE 8 


FORMULAS FOR PREDICTING THE NUMBER OF ERRORS 
FOR EACH TYPE OF CLASSIFICATION ON THE 
BASIS OF STIMULUS GENERALIZATION 


Type of              Formula for the 
classification       predicted number of errors 

I                    4n₂ + 8n₁ + 4n₀ 
II                   8n₂ + 8n₁ + 0n₀ 
III                  6n₂ + 8n₁ + 2n₀ 
IV                   6n₂ + 6n₁ + 4n₀ 
V                    8n₂ + 6n₁ + 2n₀ 
VI                   12n₂ + 0n₁ + 4n₀ 


of these three numbers into the formulas in 
Table 8 yielded predictions that conformed 
with the number of errors actually made 
during the learning of classifications of each 
of the six types. To begin with, we consider 
only predictions to the first classification to 
be learned of each type. (The case of the 
later problems, which is complicated by the 
differential within-type transfer, will be con- 
sidered later.) Therefore, since the predic- 
tion is to the first problem of each type only, 
the numbers n₂, n₁, and n₀ were taken from 
the first identification problem also. Un- 
fortunately, since the first identification 
problem always preceded the classification 
problems, this prediction might be system- 
atically biased in the direction of overesti- 
mating the number of errors for all types of 
classifications. However, such an overesti- 
mation should affect all types equally and, 
hence, should not interfere with the predic- 
tion of the relative spacing of the six types 
with respect to difficulty. 

Results of the test. The number of confusions
between stimuli during identification
learning decreased on the average as the 
number of properties they had in common 
decreased. The actual numbers, n2, n1, and
n0, obtained from the first identification
problem were 5.03, 2.79, and 2.17, respec-
tively. These numbers therefore conform to 
the kind of monotonically decreasing gradient
of generalization typically found in studies
of generalization during identification
learning (e.g., see Shepard, 1958a, p. 246).
Since generalization seems to have operated 


in the expected manner in the identification 
problem, then, the conditions are appropriate 
for the test of whether the prediction to the 
classification problems is also successful. In 
Figure 13 the number of errors actually 
made on the first classification problem of 
each type is plotted against the number of 
errors predicted from the previously obtained
gradient, n2, n1, n0 (and the formulas
in Table 8). In contrast to the agreement 
with our expectations for identification 
learning, the prediction to classification 
learning clearly failed. The predicted num- 
bers of errors were too great for all except 
perhaps Type VI; the amount of variation 
between the predicted numbers of errors 
was strikingly smaller than the amount of 
variation between the actual numbers; and, 
finally, the predicted ranking of the diffi- 
culties of the six types was itself incorrect. 
(Note that, although I and VI were cor- 
rectly predicted to be the easiest and most 
difficult classifications, II was erroneously 
predicted to be next to VI rather than next 
to I in difficulty.) The fact that the first 
identification problem always preceded the 
Fig. 13. Mean number of errors made when each
type of classification was learned for the first time,
plotted against the number of errors predicted from
the gradient (n2, n1, n0) obtained during identification
learning. (The departure of the six points
from the 45-degree line represents a predictive
failure of the strong interpretation of the principle
of stimulus generalization in classification learning.)




first classification problem might in part ac- 
count for the first kind of failure of the 
prediction, but it presumably could not ac- 
count for the remaining two. The results of 
the test seem clear then; the strong interpre- 
tation of the principle of generalization can- 
not by itself account for the difficulties of 
the different types of classifications. 
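
The substitution described above is simple arithmetic and can be reproduced as a short sketch. The coefficients are those of Table 8 and the gradient is the one reported from the first identification problem. The variable names are assumptions introduced here for illustration.

```python
# Coefficients of n2, n1, n0 for each type, as given in Table 8.
TABLE_8 = {
    "I":   (4, 8, 4),
    "II":  (8, 8, 0),
    "III": (6, 8, 2),
    "IV":  (6, 6, 4),
    "V":   (8, 6, 2),
    "VI":  (12, 0, 4),
}

# Gradient (n2, n1, n0) reported for the first identification problem.
GRADIENT = (5.03, 2.79, 2.17)


def predicted_errors(coefficients, gradient):
    """Weighted sum of the gradient values for one type of classification."""
    return sum(c * n for c, n in zip(coefficients, gradient))


predictions = {t: predicted_errors(c, GRADIENT) for t, c in TABLE_8.items()}
ranking = sorted(predictions, key=predictions.get)  # easiest to hardest

for t in ranking:
    print(f"Type {t}: {predictions[t]:.2f}")
```

With these inputs the predicted order from easiest to hardest is I, IV, III, V, II, VI, which places Type II next to Type VI, as noted in the text.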


Tests of a weaker interpretation. Of 
course one could object that the strong inter- 
pretation of the principle of generalization is 
too stringent. In particular, one might argue 
that the gradient of generalization is not 
fixed but, rather, changes in some systematic 
way when the experiment is converted from 
one on identification learning to one on clas- 
sification learning. However, contrary to 
this argument it can be demonstrated that 
every possible gradient that might be as- 
sumed leads to an incorrect prediction. The 
demonstration proceeds as follows: First, 
since the gradient consists of just three 
numbers (n2, n1, n0), any possible gradient
is uniquely representable as a point in the 
+++ octant of the three-dimensional Euclidean
space with Cartesian coordinates n2,
n1, and n0. Moreover, according to the formulas
in Table 8, multiplication of the three
numbers of a gradient by the same constant 
affects the prediction only of the absolute 
number of errors, but not the relative spac- 
ing among the six types. For purposes of 
predicting only the relative spacing, then, 
we can restrict consideration to the set of 
normalized gradients for which n2 + n1 + n0
= 1. Each such gradient is represented by
a point on the triangular plane with vertices 
at the points (1,0,0), (0, 1,0), and (0, 0, 1) 
as illustrated in Figure 14. The triangular 
region is partitioned in the figure to show 
what the general shape of the gradient is 
for various subregions of the total triangle. 
(For example, all monotonically decreasing 
gradients can be seen to fall in two triangu- 
lar sectors on the left of the total triangle.) 
The point of intersection of all the triangu- 
lar sectors at the center corresponds to the 
flat gradient with n2 = n1 = n0 = 1/3.

Now to each point in the triangular space 
of normalized gradients there corresponds 
a prediction of the relative difficulty of each 
of the six types that can be directly determined
simply by substituting the three numbers
for that point (n2, n1, n0) into the six
formulas in Table 8. This space can there- 
fore be systematically explored to see 
whether any gradient exists that yields the 
correct ranking of the six types. Figure 15 
summarizes the results of this exploration. 
(The triangle exhibited there is the same as 
the one illustrated in Figure 14 but, for con- 
venience, is now presented as normal to the 
line of regard.) This triangular space is 
partitioned into sectors within which the 
same ranking holds (although the relative 
spacing of the types changes continuously 
from one point to another within any sector).
For gradients falling on any boundary
line separating two adjacent sectors, the 
predicted ranking contains a tie. Such a tie 
is always between those types that change 
rank orders in moving from one sector to 
the other. All six types are therefore tied at
the central point of intersection of all boundaries.
Fig. 15. Ranking of the difficulties of the six
types of classifications predicted for every possible
gradient of generalization. (Equations are given in
terms of the coordinates n2, n1, and n0 for the lines
that divide the triangular space of possible gradients
into regions for which the same ranking
holds.)


The first thing to note is that, although 
there are (6!)2^5 or 23,040 possible rankings
of the six types (including ties), only 16 + 
16 + 1 or 33 of these can be generated by 
varying the shape of the gradient of general- 
ization. There is therefore a real question 
as to whether there exists a gradient that 
will make the correct prediction. This ques- 
tion must be answered negatively, however, 
for at no point in the triangle does the rank- 
ing I < II < (III, IV, V) < VI occur. 
The only close approximation is near the 
bottom of the central vertical boundary 
where the ranking is (I, II, III) < (IV,
V) < VI. But this ranking clearly departs 
from the empirical pattern exhibited in Fig- 
ure 12. Moreover it requires an implausible 
gradient in which stimuli that do not have 
the same value on any of the three dimensions
are confused more frequently than
stimuli that do have one common value (see 
Figure 14). 
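
The exploration of the triangle of normalized gradients can be imitated with a coarse grid search. This is a minimal sketch, assuming the Table 8 coefficients and an arbitrary grid resolution (`STEPS`, a value chosen here for illustration). It looks for any gradient under which Type I is predicted to be easier than Type II, Type II easier than each of Types III, IV, and V, and each of those easier than Type VI.

```python
from fractions import Fraction

# Coefficients of n2, n1, n0 for each type, as given in Table 8.
TABLE_8 = {
    "I":   (4, 8, 4),
    "II":  (8, 8, 0),
    "III": (6, 8, 2),
    "IV":  (6, 6, 4),
    "V":   (8, 6, 2),
    "VI":  (12, 0, 4),
}


def predicted(gradient):
    """Predicted number of errors for every type under one gradient."""
    return {t: sum(c * n for c, n in zip(coeffs, gradient))
            for t, coeffs in TABLE_8.items()}


def matches_empirical_ranking(p):
    """True if I < II < (III, IV, V) < VI, with no constraint among III, IV, V."""
    middle = (p["III"], p["IV"], p["V"])
    return p["I"] < p["II"] < min(middle) and max(middle) < p["VI"]


STEPS = 60  # assumed grid resolution on the triangle n2 + n1 + n0 = 1
hits = []
for i in range(STEPS + 1):
    for j in range(STEPS + 1 - i):
        n2, n1 = Fraction(i, STEPS), Fraction(j, STEPS)
        n0 = 1 - n2 - n1  # exact arithmetic keeps ties exact
        if matches_empirical_ranking(predicted((n2, n1, n0))):
            hits.append((n2, n1, n0))

print(len(hits))  # 0
```

The empty result does not depend on the grid. From the Table 8 formulas, Type I is predicted to be easier than Type II only when n0 < n2, whereas Type II is predicted to be easier than Type III only when n2 < n0, and the two conditions cannot hold together.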

The attempt to account for performance
in a classification task in terms of the prin- 
ciple of generalization alone also fails in 
another way. For, although this principle is 
not necessarily inconsistent with a general 
improvement in performance on successive 
problems, it does not seem capable of accounting
for any variations (from one type
to another) in the amount of within-type 
transfer. Thus, not only the initial ranking, 


but also the subsequent shift (shown in 
Figure 6) to the different ranking with VI 
< (III, IV, V) is inexplicable on the basis 
of this principle. 

One reason for the failure of the general- 
ization principle. The fact that the strong 
interpretation of the principle of stimulus 
generalization yielded a reasonable predic- 
tion of difficulty only for Type VI (see Fig- 
ure 13) suggests that the most serious short- 
coming of the generalization theory is that 
it does not provide for a process of abstrac- 
tion (or selective attention). The argument 
runs as follows: In a Type I classification S
notices that the values on one of the three 
dimensions are highly correlated with the 
classificatory responses. That one dimension 
then becomes the focus of S’s attention. The 
stimuli of the between-class pairs will still 
have properties in common; but, since all of 
these shared properties are on the now un- 
attended dimensions, they will no longer 
mediate generalization to the same extent as 
in identification learning (where, in order to 
respond correctly, S must attend to all three
dimensions). By abstracting the relevant 
dimension, then, S might keep the total 
number of errors in a Type I classification 
well below that predicted from the general- 
ization theory. A similar argument can be 
developed for Type II and, in a slightly 
modified form, for Types III, IV, and V. 
In Type VI, however, there is no oppor- 
tunity for abstraction in this sense; for, in 
order to respond correctly in a Type VI 
problem, S must take account of all three 
dimensions (just as in identification learn- 
ing). Figure 13 suggests that, when this 
kind of abstraction is precluded, the general- 
ization theory alone may account for the 
initial difficulty of a classification. Further 
support for this distinction between general- 
ization (or stimulus confusion) and abstrac- 
tion (or selective attention) will be presented 
when we come to the discussion of indi- 
vidual differences. We shall also argue that 
the marked positive transfer observed within 
a series of Type VI problems is evidence for 
a somewhat different kind of abstractive 
process. Meanwhile, however, we need to 
examine some notions that might be thought 
to account for the simple abstraction of (or
selective attention to) relevant dimensions
of the stimuli (cf. also Binder & Feldman, 
1960, pp. 15-22). 


Conditioning of Cues 


Following the explanatory successes of 
certain theoretical notions introduced par- 
ticularly by Estes (1950, 1959b), many learn- 
ing theorists are now predisposed to regard 
a stimulus as a collection of elements or cues 
each of which can separately become condi- 
tioned to a response. The application of this 
idea to the learning of classifications seems 
at first rather straightforward. In a given 
Type I classification, for example, Response 
A might always be reinforced in the pres- 
ence of the cue black but never in the pres- 
ence of the cue white. Conversely Response 
B would always be reinforced in the presence 
of the cue white but never in the presence of 
the cue black. However both responses 
would be reinforced just half of the time in 
the presence of each of the other cues: large, 
small, triangular, circular. In this way we 
are apparently provided with an account of 
how Response A comes to be associated with 
the black stimuli and Response B with the 
white stimuli regardless of their size and 
shape. 

Furthermore, as pointed out by Bush and 
Mosteller (1951), this kind of theory might 
even subsume the principle of stimulus gen- 
eralization. In particular, since the prob- 
ability that a response will be made to a 
given stimulus is generally assumed to be 
equal to the fraction of the cues (in the 
given stimulus) that has been conditioned 
to that response, the probability of a re- 
sponse that has been conditioned to a par- 
ticular stimulus should fall off linearly for 
stimuli that have two, one, and zero proper- 
ties in common with that stimulus. And, as 
we shall indicate later, although most of the 
generalization gradients actually obtained 
from the identification condition were some- 
what concave upward, many were nearly 
linear. 

A closer examination, though, reveals that 
the performance of Ss who behaved in accordance
with this theory could never approach
the degree of accuracy that is empirically
observed. In the example considered
above (in which only the cues of color are 
relevant to the classification) the presence of 
size and shape cues, half of which are always 
conditioned to the wrong response, must in- 
fluence Ss to respond incorrectly to a sub- 
stantial fraction of the presentations, regard- 
less of how long training might be continued. 
For the same reason, Ss would never be able 
to reach criterion on an identification prob- 
lem in which the stimuli consisted of over- 
lapping collections of cues. We now consider 
two elaborations of the basic cue-condition- 
ing idea that have recently been proposed in 
attempts to correct this deficiency; namely, 
the pattern model of Estes and the adapta- 
tion model of Restle. 
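
The ceiling on accuracy implied by the basic cue-conditioning idea can be illustrated with a small calculation. This is only a sketch under strong assumptions that are not stated in the text: each stimulus is treated as exactly three equally weighted cues (one per dimension), the relevant color cue is taken to be fully conditioned to the correct response, and each irrelevant cue is taken to be conditioned to the correct response with probability one half. The response rule is the one cited above, namely that the probability of a response equals the fraction of cues conditioned to it.

```python
from fractions import Fraction

# Assumed asymptotic probability that each cue of a stimulus is conditioned
# to the correct response in the Type I (color-relevant) example.
CUE_CORRECT = {
    "color": Fraction(1),     # relevant cue: always reinforced consistently
    "size":  Fraction(1, 2),  # irrelevant cue: reinforced half the time each way
    "shape": Fraction(1, 2),  # irrelevant cue: reinforced half the time each way
}


def probability_correct(cue_probabilities):
    """Fraction of (equally weighted) cues conditioned to the correct response."""
    return sum(cue_probabilities.values()) / len(cue_probabilities)


asymptote = probability_correct(CUE_CORRECT)
print(asymptote)  # 2/3
```

Under these assumptions accuracy levels off at two thirds however long training continues, which is the kind of shortfall the paragraph above describes.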

Conditioning of patterns of cues. The pat- 
tern model described by Estes (1957, 1959a, 
1960) specifies that responses can become 
connected not only to the individual cues as 
independent elements of the stimulus but 
also to the total pattern of cues that uniquely 
constitutes each stimulus. Since, as noted 
above, the asymptotic performance of actual 
Ss surpasses that predicted by the original 
cue-conditioning model, Estes (1957, p. 616; 
1960, p. 60) concludes that the total patterns 
must eventually prevail in controlling the 
responses. On the other hand, there is some 
evidence that the conditioning of individual 
cues predominates during the early phases 
of learning (Estes, 1957). This may in part 
be attributable to the fact that the cues are 
more frequently available than the patterns. 
Thus in our experiments a single cue (e.g.,
the color black) occurs on half of the presentations,
whereas a single pattern (e.g., the
large black triangle) occurs on only an 
eighth of the presentations. In any case, if 
the responses eventually come under the 
exclusive control of the patterns, the per- 
formance of Ss will approach 100% correct 
on either identification or classification prob- 
lems (as actually observed). 

Several difficulties still remain, however: 
First, the admission that patterns of cues 
can themselves become directly connected to 
responses removes some of the appeal of the 
cue-conditioning model. The original model
was rather close to a description of a physically
realizable mechanism. Indeed Rosenblatt's
"Perceptron" (1958) might even be
regarded as one way of physically realizing 
the kind of general idea formalized in Estes' 
original model. The pattern model, although 
perfectly permissible as a formalism yielding 
testable predictions, seems to leave more of 
the inner mechanics unspecified. In particu- 
lar, the details of the process whereby the 
individual cues become fastened together into 
a functional unit that can be directly attached 
to a response remain mysterious. 

Of course one could argue that, since the 
pattern can enter into a unitary relation with 
a response, it is itself just another cue that 
was part of the stimulus all along. The only 
unique feature of this particular cue is that 
(unlike the others) it is not shared by any 
other stimulus. Such an argument would, 
indeed, be completely consistent with Estes'
(1960, p. 52) definition of cue or "stimulus 
element." Unfortunately, though, it points 
up yet further problems. In order to master 
an identification or classification problem, 
the conditioning of the cue corresponding 
to the total pattern of other (component) 
cues must completely override the connec- 
tions (previously formed) between these 
component cues and the responses. But no 
specific rules seem to have been given by 
Estes for the process of eliminating these 
earlier connections. Finally, if the attainment 
of criterion is possible only because the total
patterns become conditioned to their appro- 
priate responses, then (even though the 
initial rate of learning might vary from one 
classification to another) the final mastery
of the different types of classifications would 
presumably be achieved after about the same 
number of trials. Thus, in order to account 
for the rapidity with which actual Ss reach 
criterion on a Type I classification, the pat- 
tern model (like the generalization model) 
seems to require the annexation of an additional
mechanism for the selective suppression
of irrelevant cues.


Adaptation of cues. One such possible 
suppression mechanism has been proposed
by Restle (1955, 1957). Restle's idea is that
cues that are uncorrelated with the reinforcement
of the responses become "adapted"
and, hence, lose their control over those responses.
Thus, in the classification problem


considered before (in which only the color 
of the stimuli is relevant), the cues of size 
and shape would adapt out leaving the black 
and white cues in complete control of the 
responses. With this additional principle, 
then, a model based upon the independent 
conditioning of the cues of a stimulus seems 
to provide a mechanism for abstraction such 
as our discussions of stimulus generalization 
and pattern conditioning led us to seek. In- 
deed, Bourne and Restle (1959) have re- 
cently shown that a variety of phenomena 
of concept learning can be accounted for by 
a model of this kind. 

Unfortunately, in order to account for the 
mastery of an identification problem (or, in- 
deed, of a Type VI classification problem), 
certain additional complications of the model 
are necessary. For example, in an identifica- 
tion problem, since a different response 
must be associated with each of the eight 
stimuli, all confusions must eventually be 
eliminated between stimuli that differ in 
color. But, as we have seen, this is possible 
only if the cues of size and shape become 
adapted. This, in turn, would preclude the 
elimination of confusions between stimuli 
differing only in size and shape. Bourne and 
Restle, in their discussion of a four-response 
problem, apparently cope with this difficulty 
by considering, in effect, that cues do not 
become adapted absolutely but only in rela- 
tion to the pairs of responses for which those 
cues are irrelevant. Such a complication of 
the notion of adaptation seems to us to de- 
crease its attractiveness as an account of the 
phenomenon of abstraction or selective at- 
tention. Furthermore, although the model 
of Restle and Bourne specifies how a cue
that is known by S to be irrelevant becomes 
adapted, it does not specify how S comes to 
know that a cue is irrelevant. 

A general dilemma faced by cue-condi- 
tioning models. Beyond the specific objec- 
tions raised against the models of Restle and 
Bourne and of Estes, these as well as other 
models for the conditioning of cues face the 
following more general difficulty: On the one 
hand, the cues might be identified with the 
elementary physical properties of the stimuli 
—e.g., largeness, smallness, blackness, white- 
ness, etc. (This is what Bourne and Restle 
appear to do in their discussion of two-response
experiments.) But then we should
have to predict that the performance of Ss 
on a Type VI classification would never im- 
prove beyond its initial chance level, for 
none of the elementary properties in this 
type of classification is by itself correlated 
with the reinforcement of either response. 
On the other hand, we might consider that 
not only the elementary properties but also 
any pattern of these elementary properties to 
which Ss can learn to respond differentially 
can serve as a cue. (This seems to be the 
original intention of Restle, 1955, pp. 11, 18; 
and of Estes, 1960, p. 52.) But under this 
interpretation we are left with the problem 
of specifying, for each possible pattern of 
cues, some parameter (e.g., a weight for that 
pattern) governing the rate at which it can 
become conditioned to a response. This, in 
turn, reduces to our original problem; 
namely, the problem of determining the difficulty
of each possible classification.¹⁴

Thus, although a theory based upon the 
notions of conditioning and, perhaps, the 
adaptation of cues at first showed promise 
of accounting both for stimulus generalization
and abstraction, further investigation
indicated that it does not, in any of the 
forms yet proposed, yield a prediction of the 
difficulty of each of our six types of classi- 
fications. Nor does this kind of theory seem 
to account for the relatively much greater 
positive transfer found in the case of Type 
VI classifications. 


14 Dattman and Israel (1951) have proposed a
principle to account for Heidbreder's results that
is very similar to the notion that the cue corresponding
to each possible classification has a certain
weight (or "salience"). According to them: "the
relative ease with which concepts are attained is
directly dependent upon the degree of perceptual
effectiveness with which the instances serve to present
the features to be conceptualized." This principle
evidently suffers from the same lack of predictive
force as the weighted-cue idea, since no objective
method is proposed whereby the "perceptual
effectiveness" can be measured independently of the
results of the concept task itself.


Abstraction and the Formulation of Rules


As we have just seen, the hypothesis that
responses can be conditioned only to elementary
properties (i.e., to the properties
that define Type I classifications) seems to 
be disconfirmed by the fact that Ss can learn 
Type VI classifications. On the other hand,
the hypothesis that responses can be conditioned
to arbitrary combinations of elementary
properties (such as that defining a
Type VI classification) removes the predictive
force of the conditioning models. Moreover,
the notion that such a combination of
elementary properties can itself serve di- 
rectly as a cue seems implausible in view of 
the kinds of verbalizations actually produced 
by Ss. In describing a single stimulus, our 
Ss used words like “large,” “black,” etc. (in 
Experiment II) or like “candle,” “trumpet,”
etc. (in Experiment I). These clearly re- 
ferred to the elementary properties. For the 
statement that a stimulus is “black,” for ex- 
ample, is essentially equivalent to the state- 
ment that it belongs within the group of 
black stimuli that are set apart by the Type I 
classification based upon color. In no in- 
stance was an individual stimulus described 
by referring to a property that would corre- 
spond in this way to the particular group of 
stimuli set apart by, say, a Type VI classi- 
fication. Indeed, even when the stimuli are 
sorted out according to a Type VI classifica- 
tion, Ss are unable to see the four stimuli in 
either class as having any one property in
common. And when (as in Experiment I)
some Ss eventually do discover a way of 
characterizing the stimuli that go together in 
such a classification, they invariably do this
by formulating an elaborate rule in terms of 
the elementary properties. As an example of 
a relatively simple rule, they might finally 
say: "The figures on the left must be black 
and small and triangular or else have just 
one of these three [Type I] properties.” But 
even after discovering a rule of this kind, Ss 
do not then regard it as a unitary property 
of a stimulus and, surely, they would not 
subsequently invoke it in describing a single
stimulus, however completely.

Outline of an alternative to the conditioning
models. In view of the preceding considerations,
we are led to consider that only the
properties defining Type I classifications act
directly as cues, and that classifications other
than Type I can be learned only by constructing
appropriate rules for them in terms
of Type I classifications. Accordingly, Ss 
are no longer regarded as passively con- 
fronting one population of cues after 
another while a certain crucial subset be- 
comes gradually connected to the correct 
response. Rather, they are regarded as ac- 
tively abstracting (or attending to) dimen- 
sions, and then formulating and testing rules 
about how the values on those dimensions 
combine and interact to determine which 
classificatory response will be correct. The 
development of an explicitly detailed model 
for this kind of process will not be under- 
taken here. The present study, together with 
others, might serve as a useful basis for such 
an undertaking. But the development itself 
would require a further investigation imple- 
mented, perhaps, by the new tool of com- 
puter simulation (Hovland & Hunt, 1960; 
Newell, Shaw, & Simon, 1957; Newell & 
Simon, 1959). Nevertheless the tentative 
description of the learning process, given 
here only in general outline, is sufficient to 
lead to certain expectations about the relative 
difficulties of our six types of classifications 
as well as about how these difficulties are 
influenced by certain of the conditions im- 
posed in the present experiments. 


The initial difficulties of the six types. If
the foregoing description of the learning 
process is correct, the difficulty of any given 
classification should be directly related to the 
complexity of the rules required to build it 
up out of Type I classifications. That these 
rules may be largely formulated and used at
the verbal level is suggested by the high cor- 
relations, found in both Experiments I and 
II, between accuracy of performance (in
sorting or responding to stimuli) for each
type of classification and the simplicity of 
the rules that could be stated by Ss for the 
same type of classification. Of course the 
rules verbalized by the Ss varied in detail
from one S or occasion to another. Still, 
since the number of possible combinations
of values increases very rapidly with the 
number of dimensions considered, the diffi- 
culty of a classification should increase with 
the number of dimensions (or Type I classi- 
fications) that are required for the specifica- 
tion of that classification. Thus we should 


at least expect the ranking I < II < VI. 
Furthermore, the other types (III, IV, and 
V) are presumably intermediate between II 
and VI. For, although all three dimensions 
are relevant for each of these classifications, 
some (but not all eight) of the stimuli can 
be properly classified by knowing the values 
on just two of the three dimensions. 


Actually there are many different meas- 
ures that could be used to express the diffi- 
culty of building up a classification out of 
Type I classifications. We could, for ex- 
ample, use the minimum length of a logic 
expression that defines the classification by 
means of conjunctions and disjunctions of 
the elementary properties. (We already 
noted in the introduction that such an ex- 
pression is much longer for a Type VI than 
for a Type I classification.) We could also 
base it upon the number of relay contacts 
required for the physical realization of the 
Boolean function corresponding to the given 
classification (Higonnet & Grea, 1958). Or 
we could base it upon the average number of 
single-dimensional binary decisions required 
to place a randomly selected stimulus in the 
appropriate class when the sequence of de- 
cisions is made in the optimum order. But 
these (and many other) measures all agree 
with the ranking derived above in a more 
informal manner. In particular, they are all 
consistent with the ranking I < II ≤ III =
IV = V < VI. The only disagreements 
concern the predictions of ties among the 
intermediate types (II through V). The 
one quantitatively defined measure that has 
seemed most satisfactory to us reflects the 
extent to which the information about the 
classification is distributed over the three 
dimensions (rather than confined to a single 
dimension—as in a Type I classification). 
In the appendix this measure is shown to 
lead to the ranking that corresponds most 
precisely with that found empirically; 
namely, I < II < (III, IV, V) < VI.
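
One of the measures mentioned above, the average number of single-dimensional binary decisions needed to classify a stimulus, can be sketched as a small search over decision orders. The sketch assumes a binary coding of the stimuli, one assumed representative per type (not taken from the figures), and one particular reading of "optimum order" in which the next dimension tested may depend on the outcome of the previous test. All names are illustrative.

```python
from itertools import product

# All eight stimuli, coded as triples of binary values (an assumed coding).
STIMULI = list(product((0, 1), repeat=3))

# One assumed representative per type: the four stimuli assigned to one class.
TYPES = {
    "I":   [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)],
    "II":  [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)],
    "III": [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 1)],
    "IV":  [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)],
    "V":   [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 1, 1)],
    "VI":  [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)],
}


def total_decisions(stimuli, class_a, unused_dims):
    """Minimum total number of decisions needed to classify every stimulus."""
    if len({s in class_a for s in stimuli}) <= 1:
        return 0  # all remaining stimuli belong to one class
    best = None
    for d in unused_dims:
        remaining = [x for x in unused_dims if x != d]
        cost = len(stimuli)  # every stimulus here undergoes the test on d
        for value in (0, 1):
            branch = [s for s in stimuli if s[d] == value]
            cost += total_decisions(branch, class_a, remaining)
        if best is None or cost < best:
            best = cost
    return best


averages = {name: total_decisions(STIMULI, members, [0, 1, 2]) / len(STIMULI)
            for name, members in TYPES.items()}
print(averages)
```

Under these assumptions the averages are 1.0 for Type I, 2.0 for Types II and III, 2.5 for Types IV and V, and 3.0 for Type VI. A stricter reading of "optimum order" (for example, one fixed order for all stimuli) could give different ties among the intermediate types.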

Transfer of classification learning. The 
tentative description of the learning process 
in terms of abstraction and the formulation 
of rules also has some implications for the 
transfer of classification learning. In par- 
ticular, if S is faced with a new classifica- 
tion problem that, however, he has reason to 
believe is of the same type as several preceding
problems, there is the possibility of a
certain amount of positive transfer. If, for 
example, the problems have all been of 
Type II, S might learn to proceed directly 
to testing rules about the interaction of 
values on pairs of dimensions without wast- 
ing any time in testing useless one-dimen- 
sional rules. 

Actually the most pronounced positive
transfer was observed in the case of Type 
VI, and the unique degree of transfer in 
this case probably comes from the interven- 
tion of a somewhat different process. In 
particular, when S has formulated the odd- 
even rule, he has apparently performed an 
abstraction on a higher level than in the case 
of the simple abstraction of relevant dimen- 
sions. Indeed this higher level abstraction 
is most effective precisely when the abstrac- 
tion of individual dimensions would pre- 
clude solution of the problem—i.e., for Type 
VI. The effectiveness of the odd-even rule 
can be appreciated by considering that Ss 
who apply it with maximum efficiency need 
only learn the response to the first stimulus; 
each following stimulus then calls for that 
same response if and only if it has an odd 
number of properties in common with the 
first stimulus. Indeed, the one S in Experi- 
ment I who most completely mastered this 
odd-even rule (viz., S4) made altogether
only one error on the last four Type VI 
problems. This is to be compared with the 
50 errors she made during the first Type VI 
problem (before discovering this powerful 
reductive rule). 
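
The odd-even rule as stated above can be checked exhaustively. This is a minimal sketch, assuming the stimuli are coded as binary triples and that a Type VI classification assigns a stimulus to a class by the parity of its coded values. The coding and the function names are illustrative assumptions.

```python
from itertools import product

# All eight stimuli, coded as triples of binary values (an assumed coding).
STIMULI = list(product((0, 1), repeat=3))


def type_vi_response(stimulus):
    """Correct response under an assumed parity coding of a Type VI problem."""
    return "A" if sum(stimulus) % 2 == 0 else "B"


def shared_properties(s, t):
    """Number of dimensions on which two stimuli have the same value."""
    return sum(x == y for x, y in zip(s, t))


def odd_even_rule(first, first_response, stimulus):
    """Give the first stimulus's response iff an odd number of properties is shared."""
    other = "B" if first_response == "A" else "A"
    if shared_properties(first, stimulus) % 2 == 1:
        return first_response
    return other


# Whichever stimulus happens to come first, the rule classifies all eight correctly.
rule_always_correct = all(
    odd_even_rule(first, type_vi_response(first), s) == type_vi_response(s)
    for first in STIMULI
    for s in STIMULI
)
print(rule_always_correct)  # True
```

The check passes because two stimuli that share an odd number of properties (three or one) differ on an even number of dimensions, and so fall in the same parity class.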

This kind of consideration can be used to 
predict the ranking of the asymptotic diffi- 
culties of the six types: i.e., the ranking 
that presumably would be achieved if a suf- 
ficient number of problems of the same type 
were consecutively supplied. For, if S has 
really abstracted a rule that uniquely defines 
a given type of classification, the difficulty 
of learning a new classification of that type 
should be directly determined by the frac- 
tion of the 70 possible classifications that 
corresponds to that type. Thus Type VI 
should become even easier than Type I (as 
it did for S4) because, whereas six of the 
70 possible classifications are Type I, only 


two are Type VI. If S knows that he is 
going to have a Type I problem, he knows 
that only one dimension will be relevant, but 
he does not know which of the three dimen- 
sions this is and he does not know which 
value on this dimension goes with each of 
the two responses (hence the 3 X 2 or six 
possibilities). But, if S knows that he is 
going to have a Type VI problem, his only 
uncertainty concerns which response goes 
with the first stimulus (hence the two possi- 
bilities). Now the number of classifications 
or possibilities for each of the six types (I, 
II, III, IV, V, and VI) are, respectively,
6, 6, 24, 8, 24, and 2. Therefore the pre- 
dicted asymptotic ranking of these types is 
VI < (I, II) < IV < (III, V). The most 
striking difference between this ranking and 
the consistently found initial ranking is the 
change in the position of Type VI from the 
extreme of greatest difficulty to the extreme 
of least difficulty. The fact that our Ss did 
not all achieve the predicted asymptotic rank- 
ing can be taken as an indication that the five 
consecutive problems of the same type did 
not provide enough opportunity for all Ss to 
discover a rule (like the odd-even rule) that 
is sufficiently abstract to carry over to a new 
set of stimuli. However, the marked drop 
in errors over the five consecutive Type VI 
problems (Figure 6) supports this analysis 
of transfer of classification learning.¹⁵
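
The counts quoted above (6, 6, 24, 8, 24, and 2 of the 70 possible classifications) can be checked by brute-force enumeration. The following Python sketch is an illustration only and is not part of the original analysis. It assumes that the eight stimuli are coded as the corners of a cube (three binary dimensions) and that two classifications are of the same type when one can be carried into the other by permuting the dimensions or interchanging the two values on any dimension. The representative chosen for each type and all function names are assumptions made for the example.

```python
from itertools import permutations, product

# The eight stimuli: every combination of two values on three dimensions.
STIMULI = list(product((0, 1), repeat=3))


def cube_symmetries():
    """Yield the 48 relabelings: permute the dimensions, optionally flip values."""
    for perm in permutations(range(3)):
        for flips in product((0, 1), repeat=3):
            yield perm, flips


def relabel(stimulus, perm, flips):
    """Apply one relabeling to a single stimulus."""
    return tuple(stimulus[perm[i]] ^ flips[i] for i in range(3))


def orbit(class_a):
    """All classifications of the same type as class_a (the stimuli given response A)."""
    return {
        frozenset(relabel(s, perm, flips) for s in class_a)
        for perm, flips in cube_symmetries()
    }


# One assumed representative per type, written as a membership rule for class A.
REPRESENTATIVES = {
    "I": lambda x, y, z: x == 1,
    "II": lambda x, y, z: x == y,
    "III": lambda x, y, z: (x, y, z) in {(1, 1, 1), (1, 1, 0), (0, 1, 0), (0, 0, 0)},
    "IV": lambda x, y, z: x + y + z >= 2,
    "V": lambda x, y, z: (x, y, z) in {(1, 1, 0), (0, 1, 1), (0, 1, 0), (1, 0, 1)},
    "VI": lambda x, y, z: (x + y + z) % 2 == 1,
}


def type_counts():
    """Number of distinct classifications belonging to each type."""
    counts = {}
    for name, rule in REPRESENTATIVES.items():
        class_a = frozenset(s for s in STIMULI if rule(*s))
        counts[name] = len(orbit(class_a))
    return counts


def asymptotic_ranking(counts):
    """Group the types by count, fewest classifications (predicted easiest) first."""
    by_count = {}
    for name, n in counts.items():
        by_count.setdefault(n, []).append(name)
    return [by_count[n] for n in sorted(by_count)]


if __name__ == "__main__":
    counts = type_counts()
    print(counts)
    print(asymptotic_ranking(counts))
```

Under these assumptions the six orbits are disjoint and together exhaust the 70 ways of assigning four of the eight stimuli to one response, so the grouping by count reproduces the predicted asymptotic ranking stated in the text.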


15 Although the odd-even rule for Type VI 
greatly simplified the task of learning such a classi- 
fication, this rule is not easy for naive Ss to dis- 
cover. Most Ss reach criterion on the first few 
Type VI problems by means of much less efficient 
rules. Thus, Ss who for some reason had previously 
had considerable experience with Type VI classi- 
fications might not find Type VI so difficult
initially.

It might also be remarked here that there is
probably a connection, in general, between the
reductive power of the best rule for a type of
classification and the number of distinct classifications
that are of that type. The number of classifications
of a given type has to do with the “symmetry” of
that type: i.e., with the number of transformations
(rotations and reflections) of the cube in Figure ^
that do not change the type of the classification
(Higonnet & Grea, 1959). And symmetry seems to
be the basis of reductive rules. For example, the
uniquely powerful odd-even rule for Type VI is
made possible by the fact that, in that type alone,




Effect of the physical representation of 
the dimensions. We now consider why the 
spacing between the difficulties of the types 
is affected by whether the dimensions are 
represented in a single compact figure or in 
three spatially distributed figures. There 
seem to be two related possibilities: one 
emphasizing the verbal and the other the 
perceptual aspects of the classification task. 

The verbal counterpart for a stimulus in 
the compact representation (A in Experi- 
ments II and III) might be, simply, “large 
black triangle.” The counterpart for a stim- 
ulus in the distributed Representation C 
would be “large circle, white triangle, shaded 
square.” As far as a Type I classification is 
concerned, this would not be expected to 
make much difference in the complexity of 
the rule. The rules might be: “If the figure 
is black, put the card on the left” (for A) 
vs. “If there is a shaded figure, put the card 
on the left" (for C). These two rules seem 
about equally complex. On the other hand, 
the typical kinds of rules given for the more 
difficult classifications tend to be longer for 
Representation C. Thus, for Type V, the 
two rules might be: “Black figures go on the 
left and white on the right except for the 
large black triangle which goes on the right 
and the large white triangle which goes on 
the left" (for A) vs. "Those containing a 
black circle go on the left and those with a 
white circle go on the right, except that if 
there is a large triangle, black circle, and 
shaded square it goes on the right and if 
there is a large triangle, white circle, and 
shaded square it goes on the left.” This dif- 
ference in the lengths of the verbal rules 
might account, in part, for the greater diffi- 
culty of the complex classifications with the 
distributed representations. 

The other explanation is somewhat different.
It argues that, in the case of the distributed
representations (Experiment I, or B
and C in Experiments II and III), Ss can


each of the three dimensions plays the same role.
This is most clearly shown in Representation B of
Experiments II and III. Here one need only know
that either one or all three of the component figures
are large to know that the stimulus belongs to a
particular class; it is not necessary to know just
which of the three figures these are.


directly perceive only the elementary prop- 
erties of the stimuli. As already proposed, 
then, they would have to build up each clas- 
sification on the basis of Type I classifica- 
tions alone. For the compact Representation 
A, however, it is possible that some of the 
simple interactions between dimensions are 
perceived more or less in the same way as 
elementary properties. For example, a large 
black triangle might be immediately seen as 
a black triangle without having to construct 
this fact out of the two component facts that 
it is black and that it is triangular. In the 
case of the compact representation, then, 
Estes’ notion that patterns of elementary 
cues can themselves serve as cues becomes 
rather plausible—at least for these simple 
“conjunctive” patterns. The implication for 
the present experiments would primarily be 
a decrease in the difficulties of the initially 
more difficult types when compact stimuli 
are used. For these types would no longer 
have to be built up exclusively from the 
elementary properties; they could now util- 
ize simple conjunctions of these as well. 
Thus, in a Type III classification, Ss would 
not have to learn to group all four stimuli in 
one class together (e.g., large black triangle, 
large black circle, small black circle, and 
small white circle) but, rather, they would 
only have to learn to group two kinds of 
stimuli in that class (e.g., large black figures 
and small circles). Such an explanation 
might also account for the slightly greater 
difficulty of the compact representation for 
Type I noted in the discussion of the em- 
pirical results. For, whereas the spatial 
separation of the dimensions might help Ss 
to isolate the single relevant property, the 
addition of conjunctive properties through 
the compact representation in effect con- 
fronts Ss with a greater number of prop- 
erties from which the correct one must be 
selected. This conjecture is supported by the 
relatively large number of two-factor rules 
formulated for Representation A of Type I 
(see Table 3). 

Individual differences. Finally, the dis- 
tinction that we have made between gen- 
eralization and abstraction suggests that a 
given classification could in principle be 
learned either by rote or by concept. If for
some reason S did not abstract or selectively
attend to relevant dimensions or formulate 
reductive rules, the difficulty of a classifica- 
tion would presumably be predictable from 
the principle of generalization: i.e., from the 
confusions made during identification learn- 
ing. In this case, since identification learning 
is essentially a rote task, we might reason- 
ably assert that the classification, also, was 
learned by rote. If, on the other hand, S 
abstracted the relevant dimensions and for- 
mulated a rule specifying how the values on 
these dimensions determine the response, the 
difficulty would no longer be predictable 
from identification learning alone. In this 
case S might be said to have proceeded in a 
conceptual manner. One source of evidence 
for these notions comes from an examina- 
tion of certain differences in the perform- 
ances of individual Ss. 

Figure 16 shows again the triangular 
space of possible gradients of generalization 
previously presented in Figures 14 and 15. 
The points denoted by the circles labeled 1 
through 6 are each based upon the average 
of the gradients (n2, n1, n0) obtained from
the five identification problems for each of
the corresponding six Ss (S1–S6) in Experiment I.
The point denoted by the circle
labeled x is based upon the average of the 
gradients for eight additional Ss who were 
run on one identification problem each in 
order to secure further data about general- 
ization. With one exception, these average 
gradients are confined to a small region of 


Fig. 16. Individual differences in gradients of
stimulus generalization. (Averages of gradients
actually obtained during identification learning are
plotted, for different Ss, in the triangular space of
possible gradients.)




the total space of possible gradients and, indeed,
are monotonically decreasing and
either concave upwards or nearly linear. 
Moreover, all 33 of the individual gradients 
upon which these six similar average gradi- 
ents are based fall in the larger shaded 
region in the left of the total triangle. For 
none of these 13 Ss did the strong interpre- 
tation of the principle of generalization pre- 
dict the obtained ranking of the types of 
classifications with respect to difficulty. 


One of the Ss (viz., S;), however, con- 
sistently differed from the others. As can be 
seen, her five individual gradients all fall
(along with their average) within the shaded
region on the right of the total triangle.
Strikingly, there is no overlap between the 
two shaded regions. Whereas for every
single gradient of the other 13 Ss, n2 > n0,
for all five of this S's gradients, n2 < n0. That
is, this S invariably confused stimuli having 
none of the three variable properties in com- 
mon more frequently than stimuli having 
two of these properties in common. Stranger 
still, this S made more errors on the “easi- 
est" type of classification (Type I) than on 
any other type. 

An examination of the rules verbalized by 
this S seems to provide an explanation for 
these puzzling reversals. Throughout the 
series of 25 problems with nonoverlapping 
stimuli, this S consistently employed a par- 
ticular recoding scheme—one used only in 
rare instances by other Ss. In applying this 
scheme, she would begin each problem by
arbitrarily picking two of the eight stimuli 
having no properties in common. The three 
pictures constituting one of these two anchoring
stimuli would arbitrarily be called
Group 1 and the three pictures constituting 
the other, Group 2. Each of the remaining 
six stimuli was then specified in terms of the 
two anchoring stimuli by stating whether a
Group 1 or Group 2 picture appeared in
each of the three possible positions. In this
way the eight stimuli shown in Figure 4
were recoded into the eight patterns exhibited
in Figure 17. The employment of
this recoding system means that this S necessarily
always took account of all three dimensions
of each stimulus and, hence, forfeited
the possibility of abstracting the single
relevant dimension in the Type I classifications.
For example, the four stimuli on the
left in Figure 17, which originally shared 
the same property (viz., the candle in the 
lower left position), no longer seem to have 
much in common. This, then, might help to 
account for the large number of errors that 
this S made on Type I problems.

In addition, though, this recoding scheme 
should have a pronounced effect on the gra- 
dient of generalization. The argument runs 
as follows: an examination of the errors 
most frequently made during the learning of 
Morse code (Keller & Taubman, 1943; Plot- 
kin, 1943) reveals that errors in which a 
single element is mistaken (e.g., a dot is
taken for a dash) are less common than 
errors in which the entire pattern of dots 


Fig. 17. The eight stimuli of Figure 4 recoded
according to the system used by this S. (The classification
shown here is the same as that shown in
Figure 4; but the order of the four stimuli on the
right has been changed so that the stimuli that are
most similar after the recoding are horizontally
adjacent.)


and dashes is transformed as a whole. The 
most frequent errors, indeed, seem to involve 
reversals (in which the temporal pattern is 
confused with the same pattern taken in 
reverse order, e.g., --· for ·--) or complementations
(in which all the dashes are converted
to dots and vice versa, e.g., --- for
···). Similarly, then, the most frequent
confusions to be expected between the pat- 
terns in Figure 17 are not those that are 
produced by altering a single element (e.g., 
by changing a 1 to a 2) but those that are 
produced by transforming the pattern as a 
whole. In particular, confusions should be 
most common in which every 1 is converted 
into a 2 and every 2 into a 1 and, perhaps 
to some extent, in which the pattern suffers 
a left-right reversal. Now, if each pattern in 
Figure 17 is most frequently confused with 
its complement (i.e., the one that is adjacent 
to it on the right or left), we are provided 
with an explanation for the unusual gradient 
in which n0 > n2; for just these pairs of
patterns correspond to the pairs of stimuli 
that have no properties in common. By ex- 
tending this argument in detail, a plausible 
case can be made for the statement that Ss 
who used this recoding scheme should also 
have n0 > n1 = n2: i.e., their gradients
should fall on the line connecting the center
of the triangle and the lower right corner.
As can be seen in Figure 16, most of the
gradients obtained from this S do in fact fall in
the vicinity of this line.¹⁶
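
The link between complementation of the recoded patterns and stimuli having no properties in common can be made concrete. The sketch below is illustrative only: it assumes that each stimulus is coded by which of two pictures occupies each of three positions, that the second anchoring stimulus differs from the first in every position, and that the function names are hypothetical.

```python
from itertools import product

# Each stimulus: which of two pictures (0 or 1) occupies each of three positions.
STIMULI = list(product((0, 1), repeat=3))


def recode(stimulus, anchor_1):
    """Pattern of 1s and 2s: 1 where the picture matches anchor 1, else 2.

    Anchor 2 is assumed to differ from anchor 1 in every position, so a 2
    always marks a picture belonging to anchor 2 (Group 2).
    """
    return tuple(1 if s == a else 2 for s, a in zip(stimulus, anchor_1))


def complement(pattern):
    """Convert every 1 into a 2 and every 2 into a 1."""
    return tuple(3 - p for p in pattern)


def reverse(pattern):
    """Left-right reversal of the pattern."""
    return pattern[::-1]


def shared_properties(s, t):
    """Number of positions in which two stimuli show the same picture."""
    return sum(a == b for a, b in zip(s, t))


def complement_partner(stimulus, anchor_1):
    """The stimulus whose recoded pattern is the complement of this one's."""
    target = complement(recode(stimulus, anchor_1))
    return next(t for t in STIMULI if recode(t, anchor_1) == target)


if __name__ == "__main__":
    anchor = (0, 1, 0)  # arbitrary choice of the first anchoring stimulus
    for s in STIMULI:
        partner = complement_partner(s, anchor)
        print(recode(s, anchor), recode(partner, anchor), shared_properties(s, partner))
```

Whatever anchor is chosen, each pattern and its complement correspond to a pair of stimuli sharing none of the three properties, which is the correspondence the argument relies on.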

Now, since this S's coding of the stimuli
prevented her from abstracting the relevant
dimensions, we should expect the strong
interpretation of the principle of generalization
to predict the initial difficulties of the
different types of classifications. Of the four
types of classifications administered to this S,
the ranking with respect to difficulty was
found to be (V, II) < VI < I. (The
number of errors on the first problem of
each of these types was, respectively, 28, 29,
38, and 55.) The predicted ranking for the
triangular subregion containing this S's average
gradient (and most of the surrounding
shaded area) is II < V < VI < I. The
only discrepancy between these two rankings
is in the case of II and V. And this discrepancy
may be attributable to the fact that this S
had Type II first (see Table 1); for, as
shown in Figure 5, Ss tended to make more
errors on the very first classification problem.
The rough agreement between the predicted
and the obtained ranking of problem
difficulties for this S is emphasized by the
fact that they both depart strikingly from
the predicted and obtained rankings for all
other Ss.


16 As already mentioned, we have used “generalization”
to refer to confusions between stimuli.
But, in accordance with the above discussion, we
should now acknowledge further that these confusions
might arise from at least two possible
sources: They might be the result of “primary”
stimulus generalization based upon the similarity of
the stimuli in terms of their physical properties
(e.g., the number of such properties that they have
in common). This assumption seems to be consistent
with the gradients that we observed for all
Ss except the S just discussed. Or, on the other
hand, these confusions might be the result of
“mediated” generalization based upon the
similarities of the implicit (recoding) responses
made to these stimuli by a particular S. This is
what we have proposed for that S.


SUMMARY 


A combined empirical and theoretical in- 
vestigation of the difficulties of different 
kinds of classifications was undertaken using 
both learning and memory tasks. Sets of 
stimuli of a variety of kinds were used but, 
in each set, there were eight stimuli each of 
which took on one of two possible values on 
each of three different dimensions. For ex- 
ample, in one set, each stimulus was large or 
small, black or white, and triangular or 
circular. The classifications to be learned or 
remembered were always set up by assigning 
four of the eight stimuli to one class and the 
remaining four stimuli to the other class. 
Three kinds of procedures were used. In 
one, Ss learned to associate one of two 
classificatory responses (e.g., A or B) with 
each of the eight stimuli by means of a 
method of successive presentation and re- 
sponse correction (the usual paired-associate 
procedure). In the two other procedures, Ss 
were presented with a simultaneous array of 
the eight stimuli already grouped into the 
two classes and then, after the removal of 
the array, were tested for their ability either 


to sort the stimuli into the same two classes 
or else to state a concise rule specifying how 
the stimuli were classified in terms of their 
dimensions and values. These procedures 
were used to determine the difficulties of all 
possible types of classifications of the stimuli 
into two groups of four. In addition, trans- 
fer of classification learning and the effect 
of representing the dimensions of the stimuli 
in different ways were also investigated. 
Finally, various mechanisms that have been 


proposed to account for phenomena of rote


learning and concept formation were evalu- 
ated in relation to the empirical results. The 
following conclusions were drawn: 


1. Of the 70 possible classifications of the 
eight stimuli into two equal groups, there 
are only six basic types. The different clas- 
sifications belonging to any one of these 
types have the same structure; they differ 
only with respect to which of the three di- 
mensions is assigned to which of the three 
roles in the classification, and with respect 
to which of the two classificatory responses 
is assigned to which group of four stimuli.
These six types we denoted by the roman
numerals I-VI. For the purposes of the 
classification, only one dimension is relevant 
for Type I, two for Type II, and all three 
for Types III-VI. These last four types 
differ, however, in the ways the values on 
the three dimensions interact in defining the 
classification. 


2. When classifications of these six types 
are encountered for the first time, they con- 
sistently differ in difficulty according to the 
ranking I < II < (III, IV, V) < VI
(with III, IV, and V about equal in difficulty).
The same ranking is found for
learning and memory tasks, inspection time
and error scores, and a variety of different 
kinds of stimuli. 


3. When several classifications of the 
same type (but each using a different set of 
stimuli) are learned in succession, the dif- 
ferent types of classifications accumulate 
differing amounts of positive transfer. As a 
consequence the initial ranking of the diffi- 
culties of the six types changes so that, after 
several consecutive problems of the same
type, Type VI (which is initially the most
difficult) becomes easier than some of the 
other types. 


4. When the stimuli are changed so that 
the three dimensions are represented as vari- 
ations in three spatially separated figures 
rather than as three kinds of variations in 
a single compact figure, the ranking remains 
the same but the spacing between the diffi- 
culties increases. In particular, the more 
difficult types of classifications (II-VI) be- 
come still more difficult, but the easiest 
(Type I) remains about the same or even 
decreases slightly in difficulty. 


5. A high correlation exists, over the 
various conditions of the experiments, be- 
tween the performance measures (in terms 
of time or error scores) and the simplicity 
of the verbal rules that Ss can formulate to 
describe the classifications. 


6. The cue-conditioning models, including 
the recent "pattern" and "adaption" models, 
do not seem to yield predictions of the diffi- 
culties of the six types of classifications. 


7. The principle of stimulus generaliza- 
tion, on the other hand, can be interpreted 
in a form that does yield testable predictions 
for the learning of classifications. The pre- 
dictions generated by the particular inter- 
pretation attempted here, however, do not 
agree with the empirically determined diffi- 
culties of the six types. Apparently the 
principle of stimulus generalization does not 
by itself provide an account for the fact that 
Ss can abstract the relevant from the irrele- 
vant dimensions of the stimuli. 


8. The results suggest that, in addition to 
abstracting the relevant dimensions, Ss learn 
any given classification by formulating a 
rule for building that classification up out of 
Type I classifications. This tentative charac- 
terization of the learning process seems to 
provide a basis for understanding the ob- 
tained ranking of the six types of classifica- 
tions with respect to initial difficulty, the 
markedly greater positive transfer in Type 
VI classifications, the effect upon difficulty 
of using compact or distributed stimuli, and 
certain individual differences.


APPENDIX


The purpose of this section is to show how cer- 
tain information-theoretic considerations lead in a 
rather natural way to the obtained ranking of the 


six types of classifications with respect to initial


difficulty. The variables x, y, and z will be used to 
represent the three dimensions of the stimuli: e.g., 
size, color, and shape. Thus the variable x might 
take on the two values: large and small. In addi- 
tion, a classification variable, C, can be defined to 
take on one value (e.g., A) for all stimuli in one 
class and another (B) for all stimuli in the other 
class. For all the classifications considered here, 
our initial uncertainty about the value of the classi- 
fication variable for a randomly selected stimulus 
is just one bit. If we know exactly what classifica- 
tion is involved, a complete specification of the 
stimulus (i.e., a statement of its size, color, and
shape) will reduce our uncertainty about the value 
of the classification variable from one bit to zero. 
This total reduction in uncertainty can be partitioned
into three components: $C(x)$, the reduction
due to the specification of size alone; $C_x(y)$, the
reduction due to the additional specification of color
(over and above that due to the specification of size
alone); and $C_{xy}(z)$, the reduction due to the additional
and final specification of shape (over and
above that due to the specification of size and color
together). Since the total reduction is one bit:

$$C(x) + C_x(y) + C_{xy}(z) = 1$$

This equation shows, in one way, the extent to 
which the information about the classification vari- 
able is spread out among the three dimensions of 
the stimuli. But it is arbitrary in that the three 
variables x, y, and z could be taken in any of their
3! or 6 possible orders. And, except for classifica- 
tions of Types IV and VI, the magnitudes of the 
three terms will depend upon the order in which 
the variables are taken. However, we shall suppose 
that the variables are taken in their optimum order:
i.e., the order for which the one-variable term is as
large as possible and the three-variable term is as
small as possible. Assuming, then, that the variables
are taken in this order, we shall rewrite the
equation in the simpler form:

$$C_1 + C_2 + C_3 = 1$$

The magnitudes of the one-, two-, and three- 
variable terms can readily be calculated for each of 
the six types of classifications (see McGill, 1954). 
They are 1, 0, 0 for Type I; 0, 1, 0 for Type II;
0, 0, 1 for Type VI; and 0.189, 0.311, 0.500 for each
of the three remaining types, III, IV, and V.
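
These magnitudes can be reproduced directly from the definition of uncertainty reduction. The Python sketch below is an illustration and not the procedure of McGill (1954). It assumes one particular representative classification for each type, codes the stimuli as triples of binary values, and takes the "optimum order" to be the order that first maximizes the one-variable term and then minimizes the three-variable term. All names are hypothetical.

```python
from itertools import permutations, product
from math import log2

STIMULI = list(product((0, 1), repeat=3))

# Assumed representative of each type: the four stimuli assigned to class A.
CLASS_A = {
    "I": {s for s in STIMULI if s[0] == 1},
    "II": {s for s in STIMULI if s[0] == s[1]},
    "III": {(1, 1, 1), (1, 1, 0), (0, 1, 0), (0, 0, 0)},
    "IV": {s for s in STIMULI if sum(s) >= 2},
    "V": {(1, 1, 0), (0, 1, 1), (0, 1, 0), (1, 0, 1)},
    "VI": {s for s in STIMULI if sum(s) % 2 == 1},
}


def binary_entropy(p):
    """Uncertainty, in bits, of a two-valued variable with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def remaining_uncertainty(class_a, known_dims):
    """Average uncertainty about the class once the given dimensions are specified."""
    cells = {}
    for s in STIMULI:
        key = tuple(s[d] for d in known_dims)
        cells.setdefault(key, []).append(s in class_a)
    total = 0.0
    for members in cells.values():
        p = sum(members) / len(members)
        total += (len(members) / len(STIMULI)) * binary_entropy(p)
    return total


def information_terms(class_a):
    """Return (C1, C2, C3) for the optimum order of the three dimensions."""
    best_key, best_terms = None, None
    for order in permutations(range(3)):
        h = [remaining_uncertainty(class_a, order[:k]) for k in range(4)]
        terms = (h[0] - h[1], h[1] - h[2], h[2] - h[3])
        key = (terms[0], -terms[2])  # largest C1 first, then smallest C3
        if best_key is None or key > best_key:
            best_key, best_terms = key, terms
    return best_terms


if __name__ == "__main__":
    for name, class_a in CLASS_A.items():
        print(name, [round(t, 3) for t in information_terms(class_a)])
```

The value 0.189 arises because, for Types III, IV, and V, the best single dimension splits the stimuli into two groups of four in which the classes occur in a 3:1 ratio; the uncertainty remaining in such a group is about 0.811 bit.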

Now the difficulty of extracting information pre- 
sumably increases with the number of variables to 
which Ss must simultaneously attend in order to 
extract that information. Therefore the difficulty 
of a classification should be greater if a larger 
fraction of the information about the classification 
is contained in the two-variable and, particularly, 
the three-variable terms. An index of difficulty, D, 
could therefore be defined by weighting each term 
according to the number of variables involved in 


that term. This can be done in a general way by 
writing:

$$C_1 + \alpha C_2 + \beta C_3 = D$$

If $\beta = \alpha = 1$, this reduces to the previous equation
and the resulting index is unity for all six types of
classifications. But, if $\beta > \alpha > 1$, the resulting
index ranks the six types, presumably with respect
to difficulty.

For any particular choice of values for $\alpha$ and $\beta$
we can determine the ranking implied by those
values by substituting them (together with the
numerical values already determined for $C_1$, $C_2$,
and $C_3$) into the expression for the difficulty, D.
Figure A1 shows the result of a systematic exploration
of the rankings implied by every pair of
coefficients, $\alpha$ and $\beta$, for which $\beta > \alpha > 1$. As can
be seen, only three solutions are possible: if
$\beta < 1.378\alpha - 0.378$, then I < (III, IV, V) < II <
VI; if $\beta = 1.378\alpha - 0.378$, then I < (II, III, IV,
V) < VI; and if $\beta > 1.378\alpha - 0.378$, then I <
II < (III, IV, V) < VI. Moreover, since the
number of possible configurations of values along 
the dimensions of the stimuli increases exponen- 
tially with the number of dimensions, the increment 
in difficulty that results from attending to three 
dimensions instead of two is probably at least as 
great as the increment that results from attending 
to two dimensions instead of one. Therefore we
should expect that $(\beta - \alpha) \geq (\alpha - 1)$ and, a fortiori,
that $\beta > 1.378\alpha - 0.378$. Hence, we again arrive
at the ranking I < II < (III, IV, V) < VI.
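
The dependence of the ranking on the two coefficients can be explored numerically. The following Python sketch is illustrative only: it takes the rounded magnitudes reported above as given, and the function names and the rounding used to detect ties are assumptions made for the example.

```python
# Information terms (C1, C2, C3) for each type, as reported in the text.
TERMS = {
    "I": (1.0, 0.0, 0.0),
    "II": (0.0, 1.0, 0.0),
    "III": (0.189, 0.311, 0.500),
    "IV": (0.189, 0.311, 0.500),
    "V": (0.189, 0.311, 0.500),
    "VI": (0.0, 0.0, 1.0),
}


def difficulty(terms, alpha, beta):
    """Index D = C1 + alpha * C2 + beta * C3."""
    c1, c2, c3 = terms
    return c1 + alpha * c2 + beta * c3


def boundary(alpha):
    """Value of beta at which Type II ties with Types III, IV, and V."""
    return 1.378 * alpha - 0.378


def ranking(alpha, beta):
    """Types grouped by D (ties detected after rounding), easiest group first."""
    groups = {}
    for name, terms in TERMS.items():
        d = round(difficulty(terms, alpha, beta), 6)
        groups.setdefault(d, []).append(name)
    return [groups[d] for d in sorted(groups)]


if __name__ == "__main__":
    print(ranking(2, 3))            # weights equal to the number of variables
    print(ranking(2, 2.2))          # beta below the boundary for alpha = 2
    print(ranking(2, boundary(2)))  # beta on the boundary
```

The boundary follows from setting the index for Type II, $\alpha$, equal to the index for Types III, IV, and V, $0.189 + 0.311\alpha + 0.500\beta$, and solving for $\beta$.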

Fig. A1. Space of possible coefficients, $\alpha$ and $\beta$,
for the measure, D, of the difficulty of a classification.
(The indicated point for $\alpha = 2$ and $\beta = 3$
corresponds to the case in which each term in the
expression for D is weighted by the number of
variables involved in that term. This point might
be considered to represent a reasonable choice of
values for $\alpha$ and $\beta$.)


ABILITIES AND LEARNING SETS
IN KNOWLEDGE ACQUISITION¹


ROBERT M. GAGNÉ 
Princeton University 


There is considerable current interest
in the techniques of programed learning,
as well as in the experimental investigation
of the nature of behavioral processes
involved in their use (Lumsdaine & Glaser, 
1960). It has been proposed (Gagné, in 
press) that the learning that transpires in 
the course of administration of learning 
programs may be called productive learning, 
and distinguished in operational terms from 
the reproductive learning of the typical
verbal learning experiment. Whereas in the
latter type, performance is usually measured 
on the specific task that is practiced during 
learning, in the former, measures of terminal 
performance are designed to demonstrate 
mastery of the defined class of tasks. For 
example, learning programs are intended to 
establish proficiency in such classes of tasks 
as "adding binary numbers," or "solving 
linear equations," and the specific tasks employed
to measure such performance are
considered merely representative of the total 
class. 
Individual differences, naturally enough,
are prominently observable in productive
learning situations, perhaps even more than
they are in verbal learning studies of the
reproductive sort. It is not uncommon, for
example, to find that fast learners complete
a learning program in half the time of slow
learners.² This fact has sometimes been
cited (Skinner, 1958) as an advantage to the
use of programed learning, contrasted to the
usual classroom learning, in that individuals
are allowed to proceed at their own pace
depending upon their initial ability. However,
there appears to be little if any current
evidence about the nature of these individual
differences in completion of learning
programs, beyond the fact that they occur.


1 This study results in part from a collaborative
effort between a project on mathematical concept
learning, directed by the senior author at Princeton
University, and the University of Maryland Mathematics
Project, with which both authors have

One hypothesis about the origin of differ- 
ences in rate of completion of a learning 
program is that rate of acquisition of the 
successive items (frames) of the program 
is basically determined by an ability which 
may be called "general intelligence." Testing 
this hypothesis would presumably involve 
the partialing out (experimentally or other- 
wise) of other factors which might affect 
speed of program completion, such as “read- 
ing speed" and “speed set." This would by 
no means be easy to do. Besides this, such
a hypothesis about general intelligence tends 
to restore an older, original meaning to the 
latter phrase, that is, "learning rate" ability. 
Undoubtedly such a restoration would be 
welcome to many psychologists, if verifica- 
tion could be obtained for it. However, it 
would also run contrary to the general trend 
of results obtained over a period of years, 
most of which have failed to obtain evidence 
of a general factor which might be called
learning rate ability that is common to
achievement in a variety of different learning
situations (cf. Woodrow, 1946).


2 This was the finding in an unpublished study
which utilized the learning program on equation
solving described herein: Gagné, R. M., & Dick,
W. Learning measures in a self-instructional program
in solving equations.

An alternative hypothesis to account for 
individual differences in rate of completion 
of, and achievement in, learning programs 
has been proposed (Gagné, in press). This 
is to the effect that such observed differences 
result primarily from the fact that individ- 
uals begin the task of learning with different 
amounts and kinds of knowledge. Knowl- 
edge relevant to any given final task to be 
learned is conceived as a set of subordinate 
capabilities called learning sets (cf. Harlow, 
1949). These are considered to be arranged
in a hierarchy such that any learning set 
may have one or more learning sets sub- 
ordinate to it in the sense that they mediate 
positive transfer to the given learning set. 
These subordinate sets in turn have other 
learning sets subordinate to them, and so on. 

Each learning set in the hierarchy is rep- 
resented by a distinct class of tasks, and 
measured in the individual by one or more 
representative tasks from this class. In 
order for learning to occur at any point in 
the hierarchy, according to this theory, each 
of the learning sets subordinate to a given 
task must be highly recallable, and inte- 
grated by a thinking process into the solu- 
tion of the problem posed by the task. The 
attainment of the final task is thus conceived 
to be a matter of successive attainment and 
"integration" of a series of lower level 
learning sets, beginning with those which are 
already available to the individual. Failure 
to achieve the final task may have several 
causes: a subordinate learning set may have 
been omitted from the program, and there- 
fore could not have been acquired; insuffi- 
cient practice (or other condition) may have 
resulted in low recallability of one or more 
subordinate learning sets; or the program 
may have been unsuccessful in inducing the 
integration necessary for the attainment of 
any given learning set in the hierarchy. 

The hierarchy of learning sets which sup- 
ports any given final task may be defined 
by an analysis procedure. The question is 
asked of the task, and successively of each 
task thereby defined: “What would the in- 
dividual have to know how to do in order 


to be able to achieve this (new) task, when 
given only instructions?" The answer to 
this question defines one or more subordi- 
nate learning sets, and each of these in turn 
may be seen to be dependent upon one or 
more subordinate learning sets, until the 
entire hierarchy is defined (see Figure 1 
below). At the bottom of this hierarchy are 
some learning sets which are very simple
and very general indeed. Although they are 
arrived at in the manner described, they 
seem to resemble, as a class, those tasks 
presented by tests used to measure certain 
very general abilities which have been identi- 
fied by factor analysis techniques. For ex- 
ample, a learning set hierarchy for the task 
of adding dissimilar fractions may be found 
to contain at its lowest level such learning 
sets as "adding two-digit numbers," "multiplying
two-digit by one-digit numbers," and
"recognizing symbols." These appear to be
the tasks represented by tests of Number
Ability and Symbol Recognition (a variety 
of Associative Memory) (French, 1954). 
We shall examine additional implications of 
this finding in a moment. 
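
The analysis procedure described above can be pictured as a recursive walk over a dependency mapping. The following is only a minimal sketch: the `SUBORDINATES` mapping, the two intermediate learning sets in it, and the function names are hypothetical illustrations built around the fractions example, not the hierarchy of Figure 1.

```python
# Hypothetical mapping: task -> learning sets immediately subordinate to it.
# The intermediate entries are invented for illustration only.
SUBORDINATES = {
    "adding dissimilar fractions": [
        "finding a common denominator",
        "adding like fractions",
    ],
    "finding a common denominator": [
        "multiplying two-digit by one-digit numbers",
    ],
    "adding like fractions": [
        "adding two-digit numbers",
        "recognizing symbols",
    ],
}


def define_hierarchy(task, subordinates=SUBORDINATES):
    """Collect every learning set below `task` by asking, for each task in
    turn, what the individual would have to know how to do."""
    found = []

    def visit(current):
        for sub in subordinates.get(current, []):
            if sub not in found:
                found.append(sub)
                visit(sub)

    visit(task)
    return found


def bottom_row(task, subordinates=SUBORDINATES):
    """Learning sets with no further subordinates (the lowest level)."""
    return [s for s in define_hierarchy(task, subordinates)
            if not subordinates.get(s)]
```

Under these assumptions, `bottom_row("adding dissimilar fractions")` yields the three simple tasks named in the text.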

In accordance with this conception of the 
learning set hierarchy, we can begin to ex- 
plore an expected set of correlates (predic- 
tor variables) for the individual differences 
to be observed in connection with the ad- 
ministration of a learning program. First of 
all, individuals would be expected to begin 
a learning program with different numbers 
and kinds of learning sets. Many of these 
will of course be irrelevant to the task, that
is, they will not have been identified as 
members of the hierarchy derived in the 
manner previously described. More im- 
portantly, however, individuals will differ 
in the pattern of learning sets they possess 
within the hierarchy of relevant learning 
sets. Some may possess only a few, all at
the lowest level; others may possess nearly 
all that are required to achieve the final per- 
formance. Still others may display rather 
uneven patterns, indicating "gaps" in their 
previously acquired knowledge. The number 
and kind of learning sets that the learner 
brings to the learning program situation 
may be expected to determine how rapidly 
he completes it, other things being equal.
For an individual who already has many of
the relevant learning sets, responding to the
frames of a learning program will be largely 
a matter of review, of traveling over familiar 
ground. For a person with only a few low 
level learning sets to begin with, the attain- 
ment of each new one may be expected to 
take time greater than is needed for “re- 
view"; and more of these will need to be 
acquired in order to attain successful
performance on the final task.

A second source of individual differences 
has its origin in the fact that different indi- 
viduals approach the learning with different 
patterns of "basic abilities," which may turn 
out to be those factors which appear con- 
sistently and dependably in factor studies 
(French, 1954). Some of these basic abili- 
ties will be related to the learning sets which 
must be acquired, in the sense that they will
have been identified as the "bottom row" of
the learning set hierarchy arrived at by 
means of the analysis just described; others 
will be not so related to the hierarchy. In 
order to have a distinctive set of terms, we 
may call the former "relevant" basic abilities 
and the latter "irrelevant." In the sense we 
wish to use the term, relevant means related 
by theoretical prediction. 

In summary then, the suggestion derived
from this theory is that differences in rate of
completion of a learning program are pri- 
marily dependent upon the number and kind 
of learning sets (i.e., the "knowledge") the 
learner brings to the situation, secondarily 
upon his standing in respect to certain rele- 
vant basic abilities, and not in any direct 
sense upon a general "learning rate" ability. 


Statement of the Problem


As is evident from our previous state- 
ments, we recognize the possibilities of in- 
dividual differences at the beginning of a 
learning program to be: (a) differences in 
knowledge, i.e., the number and pattern of 
learning sets, both relevant and irrelevant 
to the final performance; (b) differences 
in amounts of basic abilities, relevant and
irrelevant; and (c) differences in a general
learning ability, general intelligence. Looked
at as a whole, the problem may be stated
in the following way: what are the causes
of individual differences in performance on 
a learning program, and what relative 
weighting of causal effect can be assigned 
to each of them? The implication of the 
theory previously stated is that a substantial 
proportion of the variance in learning pro- 
gram performance is attributable to the at- 
tainment or nonattainment of learning sets 
relevant to the final task which the program 
is designed to teach. 

In more specific terms, the theory, together
with assumptions derived from other
psychological findings, would predict the
following things about individual differences
in programed learning (or, more generally,
in productive learning):

1. Individual differences in those begin- 
ning a learning program may be independ- 
ently measured as differences in (a) general 
intelligence, (b) relevant basic abilities, (c)
number and pattern of relevant learning 
sets. The word “relevant” means “rationally 
derived as related" in accordance with the 
analysis procedure previously outlined.


2. An ideally effective learning program 
has the effect of reducing the variance at- 
tributable to 1c to zero, since in such a pro- 
gram all learning sets are attained by every- 
one. To the extent that a learning program 
is ineffective, however, an increasing num- 
ber of individuals will be "eliminated" from 
attainment of learning sets at progressively 
higher levels of the hierarchy (and accord- 
ingly, from attainment of the final task). 


3. Both Factors 1b and 1c are considered
to mediate specific, rather than general, posi- 
tive transfer to the learning of relevant 
learning sets in the hierarchy. Positive trans- 
fer in this situation may be measured by 
rate of learning, that is, by the time taken 
by the learner to attain any or all of the 
relevant learning sets including the final task. 
Accordingly, as the learner progresses up- 
wards in the hierarchy, his rate of learning 
should depend increasingly on the attain- 
ment or nonattainment of relevant learning 
sets, and decreasingly on relevant abilities. 
Specifically, this means that the rate of 
learning of learning sets will correlate to a 
decreasing extent with relevant abilities as 
one progresses to higher and higher learning
sets. Since the same progression partakes 
increasingly of new learning (as opposed to 
recall of previously acquired learning sets), 
there should also be increasing correlation 
of rate of learning with attainment of im- 
mediately subordinate learning sets. 


4. The effects of Factor 1a, general
intelligence, should be apparent in moderately
low correlations of learning rate with meas- 
ures of this factor. However, the amount of 
"general" transfer will be expected to remain 
constant as the learner progresses upwards 
in the hierarchy, as exhibited by a constant 
size of correlation coefficient with learning
rate at all levels of the hierarchy. The same
pattern of correlation (i.e., a constant one)
would be expected between learning rates 
and irrelevant basic abilities, presumably be- 
cause of the extent to which these too 
sample general intelligence and mediate gen- 
eral transfer. 


5. The relations of basic and general abili- 
ties to achievement of learning sets in the 
hierarchy, as opposed to learning rate, ap- 
pear somewhat more difficult to predict. 
By achievement here is meant being successful
or unsuccessful on the task which
represents a learning set, more or less in- 
dependently of time (within reasonable 
limits). Primarily because of the reasoning 
outlined in Paragraph 2, more and more 
individuals would be expected to "drop out" 
at higher levels of the hierarchy, to the ex- 
tent that the learning program is ineffective. 
As this happens, the "selected" group will 
presumably contain increasing numbers of 
individuals who score high on relevant basic 
abilities (and perhaps to a lesser extent, 
high on general intelligence). Consequently, 
correlations of relevant basic abilities with 
achievement of progressively higher learning 
sets should exhibit an increasing pattern, in 
a moderately ineffective learning program. 
In a sense, then, the rate and amount of such 
increasing correlation may be looked upon 
as an inverse measure of the effectiveness 
of a learning program.


It was our intention to test as many as
possible of these predictions from the theory
of learning set hierarchies. To do this, we
first analyzed a final task represented by an
existing learning program in solving linear
algebraic equations, to define a hierarchy of 
learning sets. Before administering the 
learning program to a group of seventh 
grade children, we obtained measures of 
those basic abilities which were revealed by 
this analysis to be relevant, as well as of 
two which were irrelevant. During the ad- 
ministration of the program, records were 
obtained which provided measures of rate
of learning of each learning set in the hier- 
archy. Following this, measures were made 
of achievement of each learning set, per- 
formance of the final task of equation 
solving, and transfer indicated by perform- 
ance in solving equations of unfamiliar form 
and content. Altogether, these measures en- 
abled us to obtain evidence on transfer of 
training within the determined hierarchy of 
learning sets, and on the relation of both 
basic abilities and learning sets (knowledge) 
to rate of learning, achievement, and trans- 
fer within the area defined by the learning 
program. 


METHOD 
Analysis of the Task 


The final task for which learning was 
intended was solving simple linear algebraic
equations, either for numerical values of a
stated variable, or for expressions of a
stated variable in terms of other variables. 
Examples of these tasks may be found in 
Appendix B. 

Analysis of this task was begun by for- 
mulating an answer to the question: “What 
would an individual have to know how to 
do in order to achieve successful perform- 
ance of this class of task, assuming he were 
given only instructions?" The phrase re- 
garding instructions needs some explanation 
(Gagné, in press), which may be restated 
here briefly. The individual must be: told 
the form of the answer (in this case, whether 
numerical or symbolic); informed of any 
definitions of stimuli required; and provided 
with guidance suggesting the application of
previously acquired learning sets to a new 
task. 


Assuming this kind of instruction, the
answer to the question "What would the
individual have to know?" turned out to be
three distinct learning sets (tasks). The 
first was "simplifying an equation by adding 
and subtracting terms”; the second was 
“simplifying an equation by multiplying, di- 
viding, adding, or subtracting arithmetic 
numbers”; and the third was “simplifying 
an equation by multiplying and dividing by 
terms." The concrete meanings of these descriptions
may be grasped by examining the
tasks used to measure them, listed as I1, I2,
and I3 in Appendix A. What they repre- 
sent are three types of problem situations 
which the solver of equations meets in the 
course of arriving at a solution to equations 
of the total class. (Of course, there are 
instances in which the simplifications de- 
scribed would actually lead to a solution, but 
these are too simple by themselves to repre- 
sent the total class of linear equation prob- 
lems.) Each of these tasks is distinct: that 
is, it is possible to know how to do any one 
without knowing how to do the others. Each, 
as will become apparent, is based upon dif- 
ferent patterns of subordinate knowledge. 
This analysis was then repeated on each 
of the three learning sets defined as subordi- 
nate to the task (Level I), until the entire 
hierarchy was defined as shown in Figure 1. 
In this figure, each block contains a descrip- 
tion of a learning set. The large dots under 
each block indicate which of the lines leading 
from subordinate learning sets is “tied into” 
the learning set defined in that block. Some- 
times, a subordinate set several levels down 
the hierarchy is related to a learning set in 
this manner; in such instances, the line
depicting the relationship may pass underneath
an intervening block, and the absence
of a dot indicates no “tie-in” with this in- 
termediate learning set. The learning sets 
are arranged into levels indicated by roman 
numerals, primarily for the purpose of de- 
notation. No particular significance is at- 
tached to these "levels," except that the 
learning sets on any given level appear to 
be approximately equivalent in complexity.
In referring to a particular learning set, one
may use the roman numeral to designate
level and arabic numerals to denote its
horizontal position beginning with 1 at the left.
Thus, the three learning sets previously de- 
scribed are I1, I2, and I3. 

In carrying out the analysis, we were of 
course guided by the outline of the learning 
program which was to be used in the study. 
This does not mean, however, that we de- 
rived the learning sets in the hierarchy 
directly from the program—quite the con- 
trary was the case, and we shall point out 
later that our method of analysis revealed 
two learning sets which were inadequately 
represented in the program. What it does 
mean is that in doing the analysis we ac- 
cepted in general the approach to solution of 
simple equations which had been designed 
into the program. This particular approach 
is by no means the only one. In other 
words, there are perhaps several possible 
learning set hierarchies which could be 
worked out to support this final task, and 
it is quite conceivable that some are “better” 
than others in the sense of being more effi- 
cient or more transferable to later learning. 
The proponents of instruction in "modern 
mathematics" would almost surely not favor 
this particular one. 

As the theory foresees, the successive ap- 
plication of the analysis procedure defines 
learning sets which are increasingly simple 
and increasingly general in the sense that 
they are potentially supportive of greater 
numbers of superordinate learning sets. A 
learning set like IIIA2 (simplifying fractional
expressions), for example, must sup-
port quite a large number of different tasks 
in the field of arithmetic. The learning sets 
at Level IVA are seen to be almost as simple 
as can be defined, for a human being. The 
concrete form of each learning set may be 
understood by referring to Appendix A. 

Somewhat parenthetically, it needs to be 
stated that the descriptions of learning sets 
employed in Figure 1 employ somewhat old- 
fashioned language, judged against the stand- 
ards of modern algebra texts. The language 
of the learning program was similarly old- 
fashioned. The validity of the procedure, 
however, should not be questioned on this 
account. When the learning sets are under- 
stood as tasks, which can be done by ex- 
amining the problems given in Appendix A,
the ambiguities of language are largely removed,
and some intuitive grasp of the
relatedness of the various learning sets in
the hierarchy may be gained. It may also be
emphasized that our purpose in this study
was to test a method and some theoretical
deductions which presumably are applicable
to any task and any learning program. We
were not directly interested in the effectiveness
of the particular program we employed,
nor in the modernity of its content.

Level V of the hierarchy defines tasks
which we hypothesize to be equivalent to 
those occurring in "factor reference" tests 
(French, 1954). Number is the name we 
use for tasks requiring the arithmetic opera- 
tions of addition, subtraction, multiplication, 
and division of one- and two-place numbers. 
Symbol Recognition is a variety of Associa- 
tive Memory usually measured by such a test 
as Picture-Number (French, 1954, p. 15). 
Beyond these two familiar factors, we found 
it necessary to identify another which is not 
so well known, called Integration I (Guil- 
ford & Lacey, 1947; Lucas & French, 1953). 
In its simplest form, it appears to be the 
capability of “holding in mind" several op- 
erations in sequence, and is often measured 
by tests of following directions. It should
be emphasized that these capabilities (the 
learning sets at Level V) were derived by 
the same analysis procedure as were others 
in the hierarchy. We did not set out to find 
tasks which resemble those used in measur- 
ing basic factors. Instead they represent the 
simplest kinds of things an individual must 
"know how to do" in order to progress up 
through the hierarchy. Having identified 
them, however, it seemed important to gather 
evidence which would relate them as basic 
abilities to performances of higher level 
tasks. 


Materials 


Tests of basic abilities. To obtain measures of 
basic abilities, we used factor reference tests sug- 
gested by French (1954). Number Ability was 
measured by the two tests Addition, and Subtrac- 
tion and Multiplication. Symbol Recognition ability 
(Associative Memory) was measured by means of 
the test Picture-Number. Integration was meas- 
ured by a test called Following Directions. In 
addition, to measure irrelevant abilities, we used 
two which would provide a rather severe test of 
our hypothesis, Verbal Knowledge (measured with 
a test called Vocabulary, V-1) and Speed of
Symbol Discrimination (Letter A).* 


3 Supplied by John W. French, Educational
Testing Service, and used with permission.

4 Used with the permission of the copyright
owner, Thelma G. Thurstone.


Learning program. The learning program de- 
signed to teach the solving of linear equations was 
originally composed of 247 frames. These were 
printed on 4" x 6" cards, and were divided into
eight convenient and roughly equal sequences, 
which were assembled into booklets with plastic 
hinges at the top. The new frames in each of the 
Booklets 2 through 8 were preceded by several 
review frames (5 to 17 in number) identical to 
certain frames in the previous booklet. Thus each 
booklet contained from 36 to 45 frames. The 
eight booklets were designed to be used in eight 
class periods over eight successive school days. The 
answer to each item on the front of the card was 
printed on the back. Mimeographed answer sheets, 
with blanks numbered to correspond to frame num- 
bers, were used for the recording of answers by 
the students. 


Performance measures. Designed to be admin- 
istered after the completion of the learning pro- 
gram were three performance measures. The first 
was a performance test of equation solving, con- 
taining 10 simple linear equations of the sort 
encountered in the learning program (see Appendix 
B). This test had a time limit of 25 minutes. 
Scoring allowed for partial correct completion of 
problems, on a scale of 0-4 for each, making the 
maximum possible score 40. The second measure 
was a transfer test (Appendix B), containing 10 
additional linear equations having somewhat un- 
familiar forms and unfamiliar symbols. This was 
a 20-minute test, and was scored in a manner simi- 
lar to that used with the performance test. The 
third test was one designed to measure achievement 
on each of the 22 learning sets identified by the 
analysis previously described. As administered, a 
single item was used to measure each learning set, 
and a total class period of 50 minutes was allowed 
for completion. These items are given in Appen- 
dix A. 


Subjects 


The subjects were members of four different 
mathematics classes, two in the seventh grade of 
Kensington Junior High School, and two in the 
seventh grade of Montgomery Hills Junior High 
School, both located in Montgomery County, 
Maryland. The classes contained students of in- 
termediate abilities, heterogeneously grouped. The 
data of students who missed one or more of the 
sessions involved in the study were eliminated 
from consideration in the results. Of the 144 
students in the four classes, 26 missed at least one 
of the sessions, leaving an N of 118.


Procedure 


The administration of the tests and learning 
program was carried out by the experimenter in 
each classroom. The classroom teacher remained 
in the room, for the most part, during these 
sessions. The teacher made assignments of work
for students to undertake, at their desks, once 
they had completed each booklet of the learning 
program. These arrangements were made before 
each session began. The assignments were com- 
posed of unrelated materials. 

During the first class session for each group, 
the tests of basic abilities were administered, 
following the directions given by French (1954). 
The learning program was then administered, 
using the eight booklets, on eight successive class 
days. Three school days intervened between the 
administration of Booklets 6 and 7, owing to the 
interruption of school attendance by a severe 
snowstorm. On the day following completion of 
the learning program, the performance test was 
administered in 25 minutes, followed by the trans- 
fer test for 20 minutes. On the following day, 
the test of learning sets was given and the total 
period of 50 minutes allowed for its completion.
Actually, there were 44 items on this test, 2 for 
each learning set, but since it was determined that 
these could not be completed, the students were 
instructed to do each odd-numbered item before 
going on to even-numbered ones, and the former 
scores were used in obtaining the measure of
learning set achievement. 

In administering the learning program, the ex- 
perimenter first read through the instructions 
which appeared on the first two cards of Booklet 
1. These told the students how to use the booklets
and how to record their answers. Students were 
instructed to write an answer to the question asked 
in each frame, and then to flip the card over and 
compare their answer with the correct answer 
printed on the back of the card. If their answer 
was right, they were to proceed; if wrong, they 
were to draw a line through it, turn the card back 
again until they could see how the right answer 
was obtained, and then record the correct answer 
and proceed. 


At the beginning of each learning session on 
successive days, the experimenter repeated the 
main points of these instructions. The students 
were especially cautioned to record each answer 
before flipping over the card to check the answer. 
(This behavior was monitored throughout by the
experimenter.) They were asked to be certain
that they knew the material on each card before 
proceeding with the next one. 


During the administration of each booklet of the 
program, students were instructed to draw a line 
at the edge of their answer sheets under each 
question they had just completed, each time the 
experimenter called “Mark!” This signal was 
given every 3 minutes, although the students were 
not told the amount of this interval. These records 
were used to compute learning rate of each section 
of the program. A score of learning rate was 
later obtained for each learning set by adding the 
times taken to complete the frames devoted to 
that learning set, interpolating where necessary. 
Since the program devoted only one frame to each
of the Learning Sets III1 and II4, no learning
rate score was obtained for these.
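
The learning rate computation can be sketched as follows. The text does not spell out the interpolation rule, so this sketch assumes that the frames finished within one 3-minute interval share that interval's time equally, and that an interval in which no frame was finished is charged to the next frame completed. The function and variable names are hypothetical.

```python
MARK_INTERVAL_MIN = 3.0  # the "Mark!" signal was given every 3 minutes


def frame_times(marks, interval=MARK_INTERVAL_MIN):
    """Estimate the minutes spent on each completed frame.

    `marks[i]` is the cumulative number of frames completed when the
    (i+1)-th signal was called. Equal sharing within an interval is an
    assumption made for this illustration.
    """
    times = []
    done_before = 0
    carried = 0.0  # time from intervals in which no frame was finished
    for done in marks:
        finished = done - done_before
        if finished == 0:
            carried += interval
        else:
            share = (carried + interval) / finished
            times.extend([share] * finished)
            carried = 0.0
        done_before = done
    return times


def learning_set_time(marks, frame_numbers):
    """Sum the estimated times of the (1-based) frames devoted to one
    learning set."""
    times = frame_times(marks)
    return sum(times[n - 1] for n in frame_numbers)
```

For example, with marks after 2, 2, and 5 completed frames, the first two frames are estimated at 1.5 minutes each and the next three at 2 minutes each, so a learning set covered by frames 2 and 3 receives a time of 3.5 minutes.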


RESULTS 


The results of the study may be described 
in accordance with the following organiza- 
tion. First, we shall be interested in report- 
ing the several measures of learning and 
performance that were obtained, which pro- 
vide a general picture of the outcomes of 
administration of the learning program. 
Second, a matter of major interest is the 
evidence of transfer among learning sets in 
the hierarchy shown in Figure 1. Do results 
indicate that these learning sets mediate 
positive transfer to higher level learning sets, 
as proposed by the theory? Third, what 
correlations were found among relevant and 
irrelevant basic abilities and total measures 
of performance during and after the com- 
pletion of the learning program? And 
finally, there is particular interest in seeing 
whether patterns of increasing and decreas- 
ing correlations are obtained, between abili- 
ties on the one hand and rates of attaining 
individual learning sets on the other. 


Measures of Learning and Performance 


The several measures of learning and per- 
formance applicable to the entire program 
are shown as means with associated standard 
deviations in Table 1. 


TABLE 1

MEANS AND STANDARD DEVIATIONS OF SEVERAL
MEASURES OF LEARNING AND PERFORMANCE
APPLICABLE TO THE ENTIRE LEARNING PROGRAM
(N = 118)

Measure                                          M      SD
Performance on equation solving test
  (10 items, 40 points)                          8.4    4.6
Transfer performance in solving unfamiliar
  equations (10 items, 40 points)                6.1    3.7
Total number of learning sets achieved
  (22 items)                                    12.4    3.4
Time to complete program (learning rate)
  in minutes                                   221.4   22.4


It may be seen that in terms of perform- 
ance in equation solving, the learning pro- 
gram was not very successful. This is also
indicated by the scores on the transfer test 
containing 10 equations of somewhat un- 
familiar form and symbolic content, scored 
with a maximum of 40 points. The mean 
number of learning sets achieved, as meas- 
ured by a test administered following the 
program, was 12.4, another indication of the 
moderately low effectiveness of the program. 


Measures of Transfer among Learning Sets


It will be recalled that the theory predicts 
high positive transfer from a recalled learn- 
ing set (or sets) at a given level and attain- 
ment of the adjacent higher relevant learning 
set(s). This prediction can be tested by 
noting the pattern of pass and fail which
obtains between the lower and higher adjacent
sets throughout the hierarchy. It may
be noted that where two or more subordinate 
learning sets are involved (for example, as 
II1 and II2 are related to I1, Figure 1),
the prediction is that all lower relevant sets 
must be passed, in order for the higher level 
set to be attained. 

The four possible empirical relationships 
for passing and failing relevant higher-lower 
learning set combinations are as follows, 
together with the theoretical significance of 
each: 

1. Higher+, Lower+: This relation indicates
the occurrence of positive transfer from
lower learning set(s) to adjacent higher
learning set(s), and is in accord with the
theory.

2. Higher-, Lower-: If any relevant adjacent
lower level set has been failed, positive
transfer to the higher level learning set
is unlikely; this outcome is also in accord
with the theory.

3. Higher+, Lower-: As indicated in 2,
this outcome is directly opposed to the
theoretical prediction.

4. Higher-, Lower+: When this outcome
takes place, it is not in opposition to the
theory. A higher level set may be failed,
even though all relevant lower level sets are
passed. A number of reasons make this
occurrence possible, most of them associated
with the effectiveness of the learning
program.
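
The four pass-fail relationships listed above can be expressed as a small classification rule. This is only an illustrative sketch; the function names are hypothetical, and the rule that the lower side counts as passed only when every relevant subordinate set was passed follows the statement made earlier in this section.

```python
def transfer_pattern(higher_passed, lower_passed):
    """Classify one learner's result for a higher learning set and its
    relevant lower sets as "++", "--", "+-", or "-+".

    `lower_passed` is an iterable of booleans, one per relevant
    subordinate set; the lower side is "+" only if all were passed.
    """
    lower_all = all(lower_passed)
    return ("+" if higher_passed else "-") + ("+" if lower_all else "-")


def contradicts_theory(pattern):
    """Only Higher+, Lower- is opposed to the theoretical prediction."""
    return pattern == "+-"
```

A learner who passes a higher set after failing one of two relevant lower sets is classified "+-", the single pattern that counts against the prediction.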

The patterns of pass-fail relationships 
indicated by data obtained on the test of 
learning set achievement administered fol- 
lowing the learning program are given in
Table 2.


Each of the transfer relationships which 
can be measured between learning sets in 
the hierarchy is listed in the first column of 
this table. As would be expected with a 
program that is not perfectly effective, the 
number of ++ relationships between higher 
and lower learning set combinations goes 
steadily down as the learning program 
carries the learner upward in the hierarchy, 
whereas the number of -- relationships increases.
Instances of higher +, lower - are
cases of positive transfer without the achieve- 
ment of lower level learning sets, and are 
therefore contrary to theoretical prediction. 
It is noteworthy that such instances are very 
low in frequency. The proportion of pass- 
fail patterns which support the theory, obtained
by dividing the total testable instances
(++, --, +-) into the number of instances
consistent with the hypothesis of positive
transfer (++, --) is shown in the final
column. Some of these proportions are 1.00, 
and none is lower than .91. 
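
The proportion in the final column can be reproduced with a short calculation. This is a sketch only; the function name and argument names are hypothetical, and the example frequencies are taken from two rows of Table 2.

```python
def proportion_positive_transfer(n_pp, n_mm, n_pm):
    """Proportion of testable instances consistent with positive transfer.

    n_pp: frequency of Higher+, Lower+
    n_mm: frequency of Higher-, Lower-
    n_pm: frequency of Higher+, Lower- (contrary to the theory)
    The Higher-, Lower+ pattern is excluded because it is not testable.
    """
    testable = n_pp + n_mm + n_pm
    return (n_pp + n_mm) / testable


# Frequencies for IIIA1 (85, 0, 7) and III1 (45, 9, 1) from Table 2.
iiia1 = round(proportion_positive_transfer(85, 0, 7), 2)
iii1 = round(proportion_positive_transfer(45, 9, 1), 2)
```

With these inputs the testable totals are 92 and 55, matching the corresponding entries in the table.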

The theoretical prediction is for the values 
in the final column of the table to be 1.00. 
Obviously, they are very nearly that, and 
far above the purely chance values of these 
patterns, which would range between .25 
and .50. One reason, irrelevant to theory, 
for their departure from the 1.00 value is 
of course unreliability of the measurements 
from which they are derived. It will be 
recalled that the measures employed are 
single item pass-fail scores. The use of two 
or more items to assess learning set attain- 
ment would have made possible an estimate 
of the reliability of these measures, but un- 
fortunately these were not used in the 
present study. In future studies, provision 
should surely be made for reliability meas- 
ures. At any rate, on the basis of present 
evidence the prediction of positive transfer
mediated by learning sets appears to be
amply supported.


TABLE 2

PASS-FAIL PATTERNS OF ACHIEVEMENT BETWEEN ADJACENT LOWER AND HIGHER LEVEL RELEVANT
LEARNING SETS, AND THE PROPORTION OF INSTANCES OF POSITIVE TRANSFER INDICATED
(N = 118)

                              Frequency of pass-fail
                              pattern (Higher, Lower)    Total       Proportion
Transfer to                    ++    --    +-    -+      testable    positive
learning set                   (1)   (2)   (3)   (4)     frequency   transfer

IV2 from IVA1                  110     0     0     8       110         1.00
IV5 from IVA3                  113     0     0     5       113         1.00
IIIA1 from IV2, IV3             85     0     7    26        92          .92
IIIA2 from IV4, IV5, IV6        94     5    10     9       109          .91
III1 from IV1                   45     9     1    63        55          .98
III2 from IV3, IIIA1            68    30     6    14       104          .94
III3 from IVA2, IV3             75    25     7    11       107          .93
III4 from IIIA2                 62    40     4    12       106          .96
II1 from IV2, III2, III3        34    70     3    11       110          .95
II2 from IVA2, III3             41    60     2    15       103          .98
II3 from III4                   37    72     3     6       112          .97
II4 from III4                    9    85     0    24        94         1.00
I1 from II1, II2                25    78     2    13       105          .98
I2 from II2, II3                28    80     3     7       111          .97
I3 from II3, II4                 6   104     0     8       110         1.00

The values for the -+ pass-fail pattern
deserve some comment. As has been said,
these are not critical evidence for or against
the theory. What they do indicate, presumably,
is relative weakness in portions of the
learning program. For example, 63 of the
118 learners failed to progress from Learning
Set IV1 to Learning Set III1, after
having achieved the former adequately. Examination
of the program shows that this
particular set (III1, identifying needed operations
in order) was covered by only a
single frame. A similar finding holds for
Learning Set II4, which has 24 failures after
success at a lower level. Learning Set IIIA1,
with 26 failures following lower level successes,
is represented by only two frames.
The occurrence of these instances suggests
strongly that the values in the -+ column
of Table 2 indicate the points at which
greater or lesser effectiveness was attained 
within the total learning program. High 
values may be interpreted as revealing the
points at which learning was relatively in-
effective, low values the points at which 
learning was effective. 

The measurement of learning sets makes 
it possible to assess in a fairly exact fashion
what the individual has learned or has not 
learned from a learning program. Examples 
of the patterns of learning set attainment in 
a low achiever and a high achiever are shown 
in Figure 2. 


Fig. 2. Examples of patterns of attainment of
learning sets in the hierarchy for two individuals, 
a low and a high achiever. (Scores on the final 
performance test of equation solving are indicated 
for each individual in the box denoting the final 
task.) 


Basic Abilities and Measures of Performance 


Product-moment correlations between the 
basic ability measures and four total per- 
formance measures are given in Table 3. 

On the whole, these coefficients yield an 
expected pattern. Particularly notable is the 
fact that correlations between each of the 
basic variables Number, Symbol Recogni- 
tion, and Integration (measured by tests 
numbered 1, 2, and 3) and each of the four 
performance measures (6, 7, 8, and 9) are 
moderately high. It will be recalled that 
these three abilities are considered to be 
relevant. In contrast, correlations of the 
irrelevant ability measured by Vocabulary 
with these performance variables are fairly 
low, despite the fact that this test is presumed
to be highly related to general intelligence.
It will also be noted that Letter A
exhibits higher correlations with performance
measures than would be expected of an
irrelevant ability. Although correlations with
this test are all slightly lower than those for
the relevant abilities, nevertheless these
results are not entirely in accord with our
predictions. Of course, it is possible that
Speed of Symbol Discrimination is, after all,
a relevant basic ability; but our analysis did
not predict it to be so.

Correlations among the performance measures
themselves are fairly high. In general,
individuals who did well on the final test of
performance were those who mastered the 
greatest number of learning sets, and took 
the shortest time to complete the program. 
Achievement on the final test is also highly 
related (.84) to achievement on the transfer 
test; it appears likely that these tests are 
measuring the same capability. 


Relations between Abilities and Learning 
Sets 


Point-biserial coefficients of correlation 
were obtained between each basic ability 
measure (relevant and irrelevant) and pass- 
fail achievement of each learning set as 
measured following learning. Product-mo- 
ment coefficients of correlation were obtained 
between each basic ability measure and rate 
of learning (time to complete the required 
frames, with sign reversed) for each learn- 
ing set in the hierarchy. For both sets of 
data, interest centered in the pattern of in- 
creasing or decreasing correlation which 
might be revealed as one considers the learn- 
ing sets from the bottom of the hierarchy 
upwards. It will be recalled that the theory 
predicts: decreasing correlation of relevant 
abilities with rates of attaining learning sets 


5 Complete data on intercorrelations of these 
variables are given in Appendix C. 


TABLE 3
PRODUCT-MOMENT COEFFICIENTS OF CORRELATION AMONG TESTS OF BASIC ABILITIES ADMINISTERED
BEFORE LEARNING AND FOUR MEASURES OF PERFORMANCE
(N = 118)

Measure                                         1          2          3          4          5          6          7          8          9
1. Addition, Subtraction-Multiplication         [illeg.]a  41         33         04         38         68         66         58         55
2. Picture-Number                                          [illeg.]   [illeg.]   14         34         62         58         56         54
3. Following Directions                                               —c         12         34         58         54         53         52
4. Vocabulary V-1                                                                94b        05         22         14         12         18
5. Letter A                                                                                 [illeg.]   [illeg.]   45         46         50
6. Performance test                                                                                    —c         84         82         78
7. Transfer test                                                                                                  [illeg.]   78         75
8. Number of learning sets achieved                                                                                          88         82
9. Time to complete program (sign reversed)                                                                                             84b

Note.—Reliabilities shown in diagonal.
a Correlation between Addition and Subtraction-Multiplication.
b Odd-even split-half correlation.
c Unavailable.

as one progresses upwards; increasing correlation
of relevant abilities with achievement
of learning sets, proceeding in the
same direction; and moderately low and 
constant correlations of irrelevant abilities 
(and general intelligence) with both learning 
rate and achievement. 
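
For readers unfamiliar with the point-biserial coefficient used for the pass-fail data, the following minimal sketch shows the textbook formula and checks that it equals the product-moment coefficient computed with passes coded 1 and failures coded 0. The function names and the six illustrative scores are hypothetical and are not data from the study.

```python
import math
from statistics import mean, pstdev


def point_biserial(scores, passed):
    """Point-biserial r between a continuous score and a 0/1 pass-fail flag."""
    ones = [s for s, p in zip(scores, passed) if p]
    zeros = [s for s, p in zip(scores, passed) if not p]
    if not ones or not zeros:
        # Everyone passed (or everyone failed): no variance, r is undefined.
        raise ValueError("pass-fail variable has no variance")
    p = len(ones) / len(scores)
    return (mean(ones) - mean(zeros)) / pstdev(scores) * math.sqrt(p * (1 - p))


def pearson(x, y):
    """Ordinary product-moment correlation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ability scores and pass-fail outcomes for six learners.
ability = [12, 15, 9, 20, 18, 7]
achieved = [0, 1, 0, 1, 1, 0]
print(round(point_biserial(ability, achieved), 3))
print(round(pearson(ability, achieved), 3))
```

The `ValueError` branch corresponds to a case such as a learning set that nearly every learner achieves, where no coefficient can be computed.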

Examination of Figure 1 will reveal how
the relevant learning sets were selected for
this treatment of the data. For Symbol
Recognition ability, the relevant learning
sets begin with IVA1, 2, and 3; at Level IV,
however, only Learning Sets IV2, IV4, IV5,
and IV6 are relevant (IV1 and IV3 are
connected to Symbol Recognition neither by
an arrow nor by an intermediate learning
set at Level IVA). At higher levels, all the
learning sets are relevant to Symbol Recognition
except for III1, since they are connected
to this ability via intermediate learning
sets. For Number Ability, relevant learning
sets begin with IV3 and IV5; again, all
others are relevant except for III1. For
Integration, the only relevant sets are IV1,
III1, and I2, and these data were not plotted
because of the sparsity of points which could
be determined.

[Figure 3, panels 3a and 3b. Vertical axis: size of correlation (Mdn); horizontal axis: level (IV, IIIA, III, II, I).]
3a. The relationships with learning rate, based upon product-moment correlations.
3b. The relationships with achievement, based on point-biserial coefficients.

Fig. 3. Median values of coefficients of correlation of relevant basic abilities (Number, Symbol
Recognition) and irrelevant basic abilities (Speed of Symbol Discrimination, Vocabulary) with measures
of learning sets at different levels of the learning set hierarchy. (For irrelevant sets, medians of
the highest three coefficients are shown, except for Level IIIA which contains only two. In all cases,
straight lines have been fitted to the sets of points by eye.)

Figure 3 graphs the relationship of the 
size of the median correlation obtaining be- 
tween relevant basic abilities and learning 
sets and between irrelevant basic abilities 
and learning sets, to the level of the hier- 
archy in which the sets are located. Figure 
3a pertains to learning rate; Figure 3b, to 
achievement of learning sets. In order to 
provide an idea of the slope of the relationships,
straight lines have been fitted to the
points by eye. The median correlation at
each level was determined from values of 
the particular number of relevant learning 
sets at that level. For the irrelevant sets, 
the median of the three highest correlations 
was taken (except for Level IIIA, where
only two exist), in order to have the number 
of values on which to base a median com- 
parable to those for relevant sets. 
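
The "median of the three highest" rule for the irrelevant abilities can be sketched as below. This is one possible reading offered for illustration only: it assumes the rule is applied to one ability at one level, and it uses the Vocabulary learning-rate coefficients for Levels IV and IIIA as transcribed from Table C1 (decimal points omitted). The helper name is hypothetical.

```python
from statistics import median


def median_of_top(values, k=3):
    """Median of the k largest coefficients (or of all, if fewer than k exist)."""
    top = sorted(values, reverse=True)[:k]
    return median(top)


# Vocabulary vs. learning rate, as read from Table C1 (decimals omitted).
vocab_rate_level_iv = [30, 26, 28, 26, 32, 24]   # IV1 .. IV6
vocab_rate_level_iiia = [22, 20]                 # IIIA1, IIIA2

print(median_of_top(vocab_rate_level_iv))
print(median_of_top(vocab_rate_level_iiia))
```

Taking only the highest values biases the irrelevant medians upward, which makes the comparison with the relevant abilities a conservative one.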

Considering the data shown in Figure 3a, 
it is apparent that there is quite a marked 
difference between the changes that occur in 
the correlations between relevant sets and 
learning rate, and those between irrelevant 
sets and learning rate, as one progresses up- 
wards in the hierarchy. Relevant abilities 
are correlated highly at low levels of the 
hierarchy, and markedly decrease in correla- 
tion at high levels. In contrast, the changes 
in correlation with irrelevant abilities, while 
not absent, exhibit an extremely small slope. 
These results, then, tend to confirm the 
theoretical prediction that rate of learning 
depends decreasingly upon relevant abilities
as learning progresses upwards in the hierarchy.
In addition, irrelevant abilities exhibit
moderately low and nearly constant values 
of correlation with the rate of attaining 
learning sets. 

The picture for correlations of abilities 
with achievement of learning sets (Figure 
3b) is nearly the opposite in every respect. 
Here, there is a pattern of increasing cor- 
relation between relevant abilities and learn- 
ing set achievement. The explanation of 
this trend is not crucially dependent on 
theory in this case, but is based on the reasoning
that increasing numbers of individuals
effectively “drop out” as learning proceeds;
thus the variance which remains becomes 
more clearly that of relevant basic abilities. 
Again, in the case of irrelevant abilities, the 
change in the size of correlation is not nearly 
so great. The smallest slope is obtained for 
Vocabulary, often considered to be a meas- 
ure of general intelligence, while Speed of 
Symbol Discrimination attains an intermedi- 
ate slope. On the whole, the contrast between 
the behavior of relevant and irrelevant sets 
is a marked one, nearly as striking as that 
which obtains for learning rate (Figure 3a), 
but in the opposite direction. 

The correlations obtained between learning
rate and Integration were .76 with IV1 and
.43 with I2; it will be recalled that the
learning rate of III1 could not be measured.
This test, too, shows the decreasing pattern
found with other relevant abilities. For
achievement of learning sets, the values of
the correlations were .53, .50, and .56 at
Levels IV, III, and I, respectively. These
data do not show the kind of increasing
pattern which obtains for the other relevant
abilities.


Relations among Learning Sets 


If the rate of learning for progressively 
higher learning sets in the hierarchy depends 
decreasingly upon relevant basic abilities, 
then, according to the theory, it must come 
to depend increasingly upon attained knowl- 
edge, that is, on the successful achievement 
of relevant subordinate learning sets. Ac- 
cordingly, the data were next examined for 
evidence of this latter relationship. 

Table 4 contains median point-biserial
coefficients of correlation for relationships
between achievement of each learning set
and rate of learning of the relevant adjacent
higher learning set or sets. These are
compared in each case with correlations between
achievement of the same learning set and
irrelevant adjacent higher learning sets.⁶
The table indicates the adjacent higher sets
in each instance from which the median
correlation was derived; the distinction
between relevant and irrelevant sets was of
course derived from the theoretically
predicted connections shown in Figure 1. (This
analysis of the data begins at Level IV,
since achievement of learning sets at Level
IVA was virtually perfect, and thus provided
no variance.)

6 The individual correlation coefficients are given
in Appendix C.

Generally speaking, the correlations of 
achievement and learning rate for relevant 
pairings show a moderate rise as one goes 
upwards in the hierarchy. In addition, each 
of the correlations for relevant pairs is 
higher than the corresponding correlation 
for irrelevant pairs. Tests of significance of 
these differences (utilizing the median
intercorrelations shown in the next-to-last
column to obtain a value of t applicable to
correlated samples) are shown in the final
column of the table. Half of these comparisons
attain a satisfactory level of significance,
while half do not. At the higher
levels of the hierarchy, all the comparisons 
are significant ones, indicating that learning 
rate at these levels is dependent upon achieve- 
ment of adjacent subordinate learning sets. 


This finding is to be contrasted particularly 
with the previous one concerning the rela- 
tions between basic abilities and learning 
rates; it will be recalled that for these 
variables, relationships pertaining to relevant 
ones become indistinguishable from irrele- 
vant ones at higher levels of the hierarchy. 
It does appear, therefore, that rate of attain- 
ment of learning sets comes to depend in- 
creasingly upon knowledge achievement (as 
indicated by successful achievement of ad- 
jacent subordinate learning sets), and de- 
creasingly upon amounts of basic abilities. 
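
The text does not give the formula used for the t "applicable to correlated samples." As an illustration only, the sketch below assumes Hotelling's test for two correlations that share one variable, applied to the Table 4 row for Learning Set II1 (relevant .55, irrelevant .30, intercorrelation .37, N = 118). The function name is hypothetical, and the authors' exact computation may have differed.

```python
import math


def t_dependent_correlations(r12, r13, r23, n):
    """Hotelling-style t for the difference r12 - r13 when variable 1 is shared.

    r12: achievement vs. learning rate of the relevant higher set(s)
    r13: achievement vs. learning rate of the irrelevant higher set(s)
    r23: correlation between the two learning-rate measures
    Returns (t, degrees of freedom).
    """
    det = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    t = (r12 - r13) * math.sqrt((n - 3) * (1 + r23) / (2 * det))
    return t, n - 3


t, df = t_dependent_correlations(0.55, 0.30, 0.37, 118)
print(round(t, 2), df)
```

Under this assumed formula the II1 comparison gives t of roughly 2.9 on 115 degrees of freedom, which agrees with the p < .01 entry in Table 4.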


DISCUSSION AND IMPLICATIONS


The results have indicated, first of all, 
that the acquisition of individual capability 
in solving linear equations, established by 
a learning program, may be conceived as a 
matter of attaining a hierarchy of learning 
sets which may collectively be called knowl- 
edge. A final task of this sort may be 
analyzed to reveal a supporting hierarchy of 
learning sets by asking the question: “What
would an individual have to know how to do
in order to perform this task, after being
given only instructions?” The answer to the
question defines immediately subordinate
learning sets, and the question may then be
applied to these tasks in turn to define the
next subordinate level of learning sets. By
theory, each learning set is conceived to
function by mediating positive transfer to a
higher level task or tasks.


TABLE 4
MEDIAN COEFFICIENTS OF CORRELATION (r_pbi) OF RELEVANT AND IRRELEVANT PAIRINGS COMPRISING
LEARNING SET ACHIEVEMENT VS. LEARNING RATE OF ADJACENT HIGHER LEARNING SET(S), WITH
SIGNIFICANCE OF DIFFERENCES BETWEEN THESE COEFFICIENTS
(N = 118)

                 Relevant pairing               Irrelevant pairing
Achievement      Learning rate of:   Mdn.       Learning rate of:   Mdn.       Mdn. r_pbi     p of
on learning                          r_pbi                          r_pbi      (2) vs. (3)    difference
set: (1)                             (2)                            (3)        (r12-r13)
IV2              II1; IIIA1          43         II2, 3; IIIA2       25         [illeg.]       >.05<.10
IV3              III2, 3; IIIA1      44         III4                19         [illeg.]       [illeg.]
IV4              IIIA2               34         IIIA1               18         26             >.10
IV5              IIIA2               33         IIIA1               19         26             >.10
IV6              IIIA2               35         IIIA1               21         26             >.10
IIIA1            III2                32         III3, 4             21         29             >.10
IIIA2            III4                35         III2, 3             22         28             >.10
III2             II1                 38         II2, 3              22         40             >.05<.10
III3             II1, 2              42         II3                 18         29             <.02
III4             II3                 41         II1, 2              19         29             <.05
II1              I1                  55         I2, 3               30         37             <.01
II2              I1, 2               54         I3                  25         36             <.01
II3              I2, 3               49         I1                  24         37             <.01
II4              I3                  57         I1, 2               30         36             <.01

In this study, such an analysis defined 22
subordinate learning sets arranged in hier- 
archical fashion. When the question de- 
scribed previously was applied to those sets 
occurring at the next to the lowest level, the 
answer defined the additional tasks of Num- 
ber, Symbol Recognition, and Integration, 
which appeared to be the same as those used 
in certain factor-reference tests. 
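
The repeated question "what must the learner already be able to do?" amounts to walking a prerequisite graph downward from the final task. The sketch below is illustrative only. Its links are taken from the "from" entries of the Table 2 rows reproduced in this part of the monograph, so it covers only part of the hierarchy (the links below IIIA1, for instance, are not included), and the names `PREREQS` and `subordinates` are hypothetical.

```python
# Immediate subordinate learning sets, as listed in the Table 2 rows shown here.
PREREQS = {
    "IIIA2": ["IV4", "IV5", "IV6"],
    "III1": ["IV1"],
    "III2": ["IV3", "IIIA1"],
    "III3": ["IVA2", "IV3"],
    "III4": ["IIIA2"],
    "II1": ["IV2", "III2", "III3"],
    "II2": ["IVA2", "III3"],
    "II3": ["III4"],
    "II4": ["III4"],
    "I1": ["II1", "II2"],
    "I2": ["II2", "II3"],
    "I3": ["II3", "II4"],
}


def subordinates(task, prereqs=PREREQS):
    """All learning sets reachable below `task` by repeating the question."""
    found = set()
    stack = list(prereqs.get(task, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            # Ask the same question of each newly identified learning set.
            stack.extend(prereqs.get(current, ()))
    return found


print(sorted(subordinates("I3")))
```

Sets with no entry in the mapping are treated as the bottom of the partial hierarchy, which is where the basic abilities would attach in the full analysis.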

Relationships among the variables measured
in the study lead to the conclusions that:
(a) a high incidence of positive transfer
(nearly 100% of the instances tested) obtains
from success on relevant subordinate
learning sets to attainment of a superordinate 
learning set; (b) a decreasing pattern of 
correlations can be shown between relevant 
basic ability factors and rate of learning 
for learning sets as one progresses upwards 
in the hierarchy; (c) in contrast, correla- 
tions of rate of learning with irrelevant 
ability factors show only a slightly decreas- 
ing trend; (d) learning rate of the learning 
sets in the hierarchy depends increasingly 
on acquired knowledge (in the form of 
subordinate learning sets) and decreasingly 
upon basic ability factors as one proceeds 
from the bottom to the top of the hierarchy. 


Implications for Ability Measurement 


These findings may have important impli- 
cations for the measurement of abilities 
involved in any human task. For one thing, 
they emphasize the importance of measures
of rate of learning as criteria against which
the predictive efficiency of an aptitude test
may be assessed. There appears to be a
fairly clear rationale for relationships
between an initial ability score and rate of
learning of a relevant task, if one assumes
that the ability represents a level of
capability which mediates a greater or lesser
amount of positive transfer. Basic abilities,
or factors, may accordingly be conceived as
the most simple and most general learning
sets, which support a great variety of more
complex activities. Such support manifests
itself as greater or lesser facilitation of the
rate of learning of related tasks.

The suggestion is, therefore, that a true 
integration of measures of productive learn- 
ing and aptitude measures can be effected 
by identifying and correlating rate of learn- 
ing of learning sets in a hierarchy with basic 
ability factors that are defined by means of 
the same analysis. Such a procedure would 
make possible the testing of hypotheses in
aptitude studies, in much the same fashion
as has been done in the present study. It
would also reveal a rational basis for the
statement that human tasks “depend upon”
or “are related to” aptitude factors. And
particularly, such studies should make possible
a suitable determination of the relative
contribution of knowledge variables, as com- 
pared with basic ability variables, in the 
achievement of performance on a final task 
(representing a class of tasks). 

The correlation of basic ability factors 
with achievement, our results suggest, is 
really a more complex matter for which the 
rationale is less clear. Specifically, the 
achievement of any given task (which may 
be any learning set in a hierarchy) depends 
not only upon the amount of basic ability, 
but also upon the amount and kind of spe- 
cifically transferable knowledge that has 
been acquired. As learning proceeds to
higher level learning sets, relevant abilities 
will correlate to an increasing extent with 
achievement only to the extent that the 
learning program is ineffective in mediating 
transfer from one level of the learning set 
hierarchy to the next. A "power" test of 
achievement, containing items of increasing 
complexity, may be conceived as a deliber- 
ately inefficient learning program, whose 
correlation with basic ability factors depends 
upon the fact that increasing numbers of 
people “drop out” as they attempt more and
more items of the test. This is, of course,
a perfectly good rationale for such a test.
But it does fail to distinguish between the
contributions made to performance by knowledge
factors as opposed to ability factors.
For precise prediction of performance on
an achievement test, one must measure the
recallability of subordinate learning sets,
rather than search for additional basic
abilities.


Learning Programing 


As has been pointed out in a previous
article (Gagné, in press), the theory of
learning set hierarchy has a number of
implications for the programing of productive
learning. Chief among these is the idea of
designing the frames of a program in such
a way that they constitute an ordered
sequence logically related to the hierarchy of
learning sets for the desired final task,
provide for recallability of subordinate learning
sets, and furnish the guidance to thinking
which will enable the learner to integrate
subordinate learning sets in the performance
of new tasks. The learning program used
in the present study was not constructed
according to these principles. Nevertheless,
the evidence has some things to say about
them.

Although the program was not deliber- 
ately designed to establish the learning sets 
identified by theoretical analysis, the evi- 
dence indicates that this is what it did do, 
to the extent that it was effective. Further- 
more, there is good evidence that when the 
particular learning sets required for new 
learning were present in the individual, high 
positive transfer resulted; when they were 
absent, very low transfer took place (cf. 
Table 1). It is also significant to note that 
the analysis, carried out according to theory, 
was able to identify three learning sets which 
were inadequately emphasized within the 
program (Learning Sets IIIA1, III1, and
II4, Figure 1) in the sense that only one or
two frames were devoted to each. The evidence
suggests that transfer was particularly
low at just these points in the program. It 
seems reasonable to conclude that the analy- 
sis of a final task into subordinate learning 
sets can be an important, if not essential, 
first step in the development of an effective 
learning program. 

The use of learning set analysis to identify
the required sequence of a learning program,
and later to measure its effectiveness, can
provide a solution to a measurement difficulty
encountered by methods in common use
at present. On logical grounds, internal
criteria of program effectiveness, such as time
to complete frames or number of errors per
frame, cannot be considered convincing
evidence of how well a program is accomplishing
what it is designed to do. Measures of
final performance or of transfer are beyond
doubt best suited for this purpose. However,
the latter measures are not particularly
diagnostic of the relative effectiveness of
portions of the learning program. The
measurement of learning sets has the advantage
of accomplishing both these purposes:
it can diagnose weak points, and it can
measure the achievement of the learner up
to any point at which he “loses understanding,”
or effectively “drops out.” The
evidence of this study shows that the
measurement of learning sets can provide the
experimenter with relatively precise
information about the progress of the individual
throughout the course of learning.


SUMMARY 


An analysis was made of the class of tasks 
“solving linear algebraic equations,” follow- 
ing the outline of a learning program de- 
signed to establish proficiency in these tasks, 
to identify a hierarchy of learning sets which 
support the attainment of such proficiency. 
This analysis was based upon a theory to
the effect that attainment of any given
learning set is dependent on recallability of
certain subordinate learning sets, instructions
defining the stimuli and goal of the new task,
and integration by the learner of subordinate
learning sets into the solution of the new
task. Subordinate learning sets are conceived
as having the function of mediating positive
transfer to higher level learning sets
throughout the hierarchy, and ultimately to the final
task. Subordinate learning sets for a given
class of tasks may be defined as the answer
to the question: “What would the individual
have to be able to know how to do, in order
to be able to perform this (new) task, being
given only instructions?” Beginning with the
final task, the question is applied successively
to each learning set so defined, and thus
identifies a progression of learning sets which
grow increasingly simple and increasingly
general.

Besides the hierarchy of 22 learning sets,
three additional ones, represented
by very simple tasks, were derived from this
analysis at the lowest level of the hierarchy.
These appeared to be identical to three tasks
occurring in ability tests which have been
identified as relatively “stable” in factorial
studies, the so-called “factor-reference” tests.
Specifically, these tasks appeared to be those
involved in tests measuring Number Ability,
Symbol Recognition (a particular form of
Associative Memory), and Integration I (an
ability apparently involving keeping several
procedures in mind at once, as in “following
directions”). The occurrence of these abilities
as an end result of a theoretical analysis
of the knowledge composition of a learning
program naturally raised the interesting
question as to how they function in support
of the learning of the final performance, as
well as of the intervening learning sets in
the hierarchy.

According to theory, basic abilities which
are relevant to learning sets in the hierarchy
should mediate positive transfer to them,
and this in turn should be measurable as an
increased rate of learning. In progressing
upwards in the learning set hierarchy,
correlations of these abilities with rate of
attainment of relevant learning sets should
decrease, since such relations come to depend
increasingly upon transfer from immediately
subordinate learning sets (i.e., upon specific
knowledge). General intelligence, while it
may be expected to correlate to a moderate
degree with rate of learning throughout the
hierarchy, should show no change with
position of learning sets at various levels of the
hierarchy, since it is conceived to mediate
general, rather than specific, transfer. The
same is true for basic abilities which are
irrelevant to the learning sets to be learned,
in the sense that they have not been
connected to them by analysis. As for
correlations of basic abilities with achievement
(pass-fail) of relevant learning sets, the
prediction would be that these would exhibit
an increasing pattern as one progresses
upwards in the hierarchy. The reasoning is
that, to the extent that the learning program
is ineffective in reducing all individual
differences in achievement, increasing numbers
of individuals “fail to grasp” the program
as it proceeds; consequently, the relationship
comes to depend increasingly on a
differentiation of high from low ability. Thus
increasing correlations of achievement with
relevant basic abilities provide an inverse
indication of the effectiveness of the learning
program. If the program were perfectly
effective, differences in achievement would
of course not be measurable.


To test these predictions, a study was 
conducted by administering the learning pro- 
gram on equation solving to a group of 118 
seventh graders in four different school 
classes. The program was divided into eight 
booklets, and administered during eight successive
class days in the school room. Preceding
this administration, factor-reference
tests were given to obtain measures of basic 
abilities, both relevant (Number Ability, 
Symbol Recognition, Integration I), and ir- 
relevant (Speed of Symbol Discrimination, 
Vocabulary). During the learning program 
administration, students were required to 
mark the margins of their answer sheets at 
3-minute intervals, to provide the basis for a 
measure of rate of learning. Following com- 
pletion of the program, in another class pe- 
riod, two 10-item tests were administered to 
measure performance in equation solving, 
and transfer to the solving of equations hav- 
ing somewhat unfamiliar symbols and form. 
Finally, a test was given to measure achieve- 
ment of the 22 learning sets in the hierarchy. 

Predictions were confirmed in the follow- 
ing respects: 

1. Correlations of theoretically relevant 
basic abilities were higher than those of 
irrelevant basic abilities with measures of 
final performance, with transfer of training 
scores, with number of learning sets achieved, 
and with rate of learning of the total pro- 
gram. 

2. Instances of positive transfer to each
learning set from subordinate relevant learning
sets were found to occur throughout the
hierarchy with proportions ranging from .91
to 1.00.

3. Correlations of relevant basic abilities 
with rates of attainment of learning sets at 
progressively higher levels of the hierarchy 
showed a steeply progressive decrease. In 
contrast, the pattern of correlations of irrele- 
vant basic abilities with the rates of attain- 
ment of comparable learning sets remained 
nearly constant. 

4. Correlations of relevant basic abilities
with achievement of learning sets at progres- 
sively higher levels showed an increasing 
pattern, whereas the comparable pattern with 
irrelevant abilities exhibited at the most a 

APPENDIX A
TASKS USED IN THE MEASUREMENT OF LEARNING SETS 


IVA1: Equivalence of 1x and x 
Which term is the same as x? 3x; 10x; —2x; y; 1x; 8x; —y 


IVA2: Identification of an Equation 
Which of the following is an equation? 


8y -- 2 — 10    (8y + 2)10
8y + 2 = 10    [illegible]


IVA3: Obtaining Products with Zero 
Which of the following are equal to zero? 


5g:3 4r-0 
2-11 16-8k 
0-157 Tf. Te 


IV1: Procedural Order 
The order to be followed in solving an equation is: 


LCRD    C—Collect like terms
RCLD    D—Divide by the coefficient of “x”
LRCD    R—Transpose all other terms to the right
DRCL    L—Transpose all “x” terms to the left


IV2: Recognition of Equivalent Terms 
Which term on the right can be combined with (added to or subtracted from) each term on the left? 
For every term on the left there is one term on the right which can be combined with it. 


1. 3(a — b) 14s 9g 5m —6n 
2. 127 Sp 40 —4a ipr 
3. 6ab —12ab es a—b = 
Е " y 
Dod 16a (к — у) 
y 
5. 16 


IV3: Performing Addition and Subtraction of Numbers in Sequence 
3 + 9 + 27 − 13 + 2 − 39 − 5 + 68 =


IV4: Recognizing Equivalence of Multiplication and Division Terms
Which three terms mean 34a multiplied by bc? 


4 
Made ЕЕ 34a + be 5 
34a — bc 34ab-¢ 34abc 


34-2 34a = bc 34a + bc 
bc 
IV5: Performing Multiplication of Numbers in Sequence


Multiply: 2, 8, 6 


IV6: Division of Parenthetical Terms
700 +b —5) _ 
3bQa +b — 5) 
IIIA1: Combining Fractions with Like Denominators
3 4 REN 


ety aty wy 
IIIA2: Simplifying Fractional Expressions 


3-8-4-9-60 _ 
10-3-12-2 



III1: Identifying Needed Operations in Order
Given this equation and the steps in its solution (in mixed-up order), place the steps in their proper
order and label the operation to be done on each step (in order to get to the next step). Use the steps listed
in Question 4. 
Equation: 4x − 2 = x + 3    Steps:
4x − x − 2 = 3
4x − x = 3 + 2


© 
8 
1 


III2: Addition and Subtraction of Terms in Sequence
2y +72 — 8y — 4w + y + 140 – = 
III3: Supplying Sum and Difference Equivalents to Sums and Differences (Arithmetic Numbers)
2--1—5-c1—623—6-8-F2-c? 
III4: Supplying Product and Quotient Equivalents to Products and Quotients (Arithmetic Numbers) 
2-4-3 _ 2-3-4 
6.2 3:6 
II1: Supplying Sum and Difference Equivalents to Sums and Differences (Terms)
3g + 5f + h = 2g − 3f − 4h + ?
II2: Simplifying an Equation by Adding and Subtracting Arithmetic Numbers to Both Sides
Solve for w: 16+w-10=7+4-2 


II3: Simplifying an Equation by Multiplying and Dividing Both Sides by Arithmetic Numbers


Solve for t: 3 = 10 
II4: Supplying Product and Quotient Equivalents to Products and Quotients (Terms)
$s5rg _ 5:44) 


10 10r 


I1: Simplifying an Equation by Adding and Subtracting Terms to Both Sides 
Solve for b: 7b + 2a + 3b − a = 10b + 2a + 35 − b


I2: Simplifying an Equation by Multiplying, Dividing, Adding, and Subtracting Arithmetic Numbers 


Solve for a: 20 = 18 + E: El 
I3: Simplifying an Equation by Multiplying and Dividing Both Sides by Terms


6(3 — e) 


Solve for e: AE 

APPENDIX B


Test for Equation Solving 


. Solve for b: 


. Solve for z: 
. Solve for a: 
. Solve for y: 
. Solve for m: 
. Solve for x: 


. Solve for a: 


8. Solve for p: 


. Solve for x: 
. Solve for g: 


2b — 3 — 8b — 4 + 3b = 13 — 6 — 3b — 2 — 6b 


32-4 _ 46-2 
S NUS 
3a-5.2 10-9-4-2 
ОШ: ee ry сары 
4y +12 
3 = 12 
om — oma 
4 
7 76:-8 
4a , 6a 
T T р = 20 +4 


7x + 4х = 3a + За + 2a — х 
2g — 3h +2 +g = 7 — 8h — 2g 


Transfer Test (Equations) 


- Solve for F: 


. Solve for L: 


3. Solve for p: 


. Solve for i: 
. Solve for g: 


. Solve for N: 


7. Solve for c: 


. Solve for Т: 


9. Solve for Q: 


. Solve for x: 


2 
44+ L4+G43L) +64 (41.0) 3L-4-2-14 

3 NEST 
2 3р +39 == ра + 3р8 – 27 218-44 — 6 3p —r 
AU ae ое MU 
gro безе а 
Sea 198 


ee Oar в) Ne ee 


MEE 12 
3N-N-2. Me 
NEU S0 EU 
6a —6b , 3 kd 
ЗҮН aie е 
аА у) 
3060 —4) 300,3 2 

0 = 20609 Qt 90 


бх + 4у +2 — 2x +y — 4z = 



APPENDIX C


TABLE C1
[COEFFICIENTS OF CORRELATION BETWEEN] BASIC ABILITIES AND RATES OF ATTAINING LEARNING SETS
(PRODUCT-MOMENT) AND BETWEEN THE SAME ABILITIES AND ACHIEVEMENT OF LEARNING SETS
(POINT-BISERIAL)
(N = 118)

Ability factors, in the same order under each heading: Number; Symbol Recognition; Integration I;
Vocabulary; Speed of Symbol Discrimination.

Learning       Learning rate                                    Achievement
set            Num.   Sym. Rec.  Integ. I  Vocab.  Speed        Num.   Sym. Rec.  Integ. I  Vocab.  Speed
IVA1           38     80         60        26      40
IVA2           40     75         54        24      28
IVA3           7      76         51        28      33
IV1            34     48         76        30      28           40     20         53        02      28
IV2            38     74         57        26      18           30     33         30        02      47
IV3            82     54         49        28      28           24     29         05        01      11
IV4            69     71         45        26      33           29     22         11        08      34
IV5            85     53         40        32      40           31     25         14        09      21
IV6            69     69         38        24      28           26     23         20        06      20
IIIA1          64     58         40        22      40           26     52         34        22      33
IIIA2          74     55         34        20      33           40     50         26        14      29
III1                                                            50     49         50        08      42
III2           44     54         34        18      30           59     51         41        24      37
III3           60     55         25        21      24           41     35         27        09      43
III4           62     34         28        24      26           68     60         43        15      47
II1            41     33         41        20      18           62     52         33        17      11
II2            54     42         30        18      33           47     42         44        25      49
II3            53     44         27        24      18           69     59         50        15      57
II4                                                             70     63         34        09      23
I1             38     27         20        19      26           63     57         41        11      36
I2             51     40         43        23      32           70     71         56        20      53
I3             57     22         18        23      40           40     30         24        10      23
TABLE C2


COEFFICIENTS OF CORRELATION (POINT-BISERIAL) BETWEEN ACHIEVEMENT OF EACH LEARNING SET AND
LEARNING RATE OF EACH HIGHER-LEVEL LEARNING SET


(N = 118) 
Achievement Rate of learning 
of learning 

set ША1 ША2 112 шз пи ПЕТ п 12 13 
IV1 20 28 31 34 25 26 24 32 28 34 30 
IV2 38 18 44 24 32 48 30 25 54 28 29 
IV3 39 18 43 46 19 49 50 24 58 4 26 
IV4 18 34 22 20 44 19 26 48 28 50 53 
IVS 19 33 21 19 45 18 24 49 28 55 64 
IV6 21 35 19 17 46 16 26 48 25 52 66 
IIIA1 32 24 18/5542 19/424 4:522 009 50 0 1857.25 
IIIA2 24 20 35 21 18 41 26 52 57 
In 22 20 18 28 26 28 
ПІ2 38 26 18 55 24 26 
ПІЗ 38-7 46.18: 555600144; 126 
пи 18 20 41 26 44 57 
In 5500734 94726 
112 58 54% 125 
пз 24 48 50 
П4 26 34 57 
A COMPARATIVE AND ANALYTICAL STUDY OF
VISUAL DEPTH PERCEPTION¹

Cornell University


“descent and fall
To us is adverse.”
—Milton, Paradise Lost, Book ii. 


One of man’s strongest fears is the fear
of high places and falling. The para-
trooper standing in the door of his airplane
waiting to jump, or the steel worker on the
girders of a rising skyscraper are dramatic
cases of fearful situations. Nearly all adults
have felt apprehensive of height when look-
ing down from a tall building or into a
gorge, when balanced at the top of a high
ladder or preparing to jump from a high
diving board.

What is the basis of such a fear? Watson
(1919), in his early studies of instinct in
the newborn human infant, found that loss
of support was a critical stimulus for fear.
Loss of support for terrestrial animals
means falling, the sudden cessation of the
upward push of the ground on the skin, the
cessation of the downward pull of gravity
on the statoliths of the inner ear, and the
simultaneous cessation of the stretching of
the body's antigravity muscles. The stimuli
are tactual, vestibular, and kinesthetic. Fall-
ing, for most animals, is in fact dangerous.
The changes in stimulation which inform
the animal that he is falling accordingly
arouse a variety of reflex postural reactions
and a feeling of fear.

1 This research was supported, in part, by a grant
from the National Science Foundation. The re-
search was carried out at Cornell University with
the exception of a few supplementary observations,
such as those on cocker spaniels, performed at
George Washington University.

2 The authors particularly wish to acknowledge
their indebtedness to James J. Gibson. He has
helped at every stage of the research presented
here. The research has also benefited from the
comments made by the many individuals with
whom we have discussed it. We would like to
mention especially the help given by Cornell gradu-
ate students Thomas J. Tighe, Herbert L. Pick, Jr.,
Jesse Smith, Alan G. Hundt, and Elinor Wardwell
and by George Washington University graduate
students Samuel Trychin and Alice B. Sheldon.

Priority of authorship of this monograph was
determined by the toss of a coin.

For adult animals who move about only 
on the ground, the danger of falling is con- 
fined to certain locations which might be 
called “falling-off places.” At such a place,
the level ground drops off to a lower level, 
making an edge, or cliff. The ability to de- 
tect a cliff by vision would be very useful, 
for it would provide the animal with the 
means of detecting a potential loss of sup- 
port. If vision provided a means of detec- 
ting a cliff, it could function to preserve ani- 
mals from falls. The theory of the evolu- 
tion of species should predict that such a 
discrimination might develop in terrestrial 
animals and that it would be effective by the 
time the animal was ready for independent 
locomotion. 

For visual detection of a drop-off, light 
to the animal’s eyes must provide informa- 
tion to differentiate the drop-off from the 
surface on which the animal stands; it must 
provide stimulation for an edge, and ideally 
for gradations of depth below the edge. It 
is a fact of optics that if two surfaces at 
different heights are textured or patterned
similarly, a difference in density of optical
texture will be present in the light projected
to the animal’s eyes (Gibson, 1950, 1958).
Figure 1 shows such a situation diagrammatically.
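
The optical fact stated above can be illustrated numerically. What follows is only a minimal sketch under simplifying assumptions: each surface is taken to be viewed head-on, and the function names, the 1-inch element size, and the 4-inch and 14-inch viewing distances are hypothetical values chosen for illustration, not measurements from the experiments.

```python
import math


def element_visual_angle_deg(element_size, viewing_distance):
    """Visual angle (degrees) subtended by one texture element viewed head-on."""
    return math.degrees(2.0 * math.atan(element_size / (2.0 * viewing_distance)))


def optical_density(element_size, viewing_distance):
    """Texture elements per degree of visual angle (reciprocal of the element angle)."""
    return 1.0 / element_visual_angle_deg(element_size, viewing_distance)


# Hypothetical values: identical 1-inch checks on both surfaces,
# seen from 4 inches (shallow side) and 14 inches (deep side).
shallow = optical_density(element_size=1.0, viewing_distance=4.0)
deep = optical_density(element_size=1.0, viewing_distance=14.0)

# The farther surface projects a finer (denser) optical texture.
density_ratio = deep / shallow
print(f"shallow: {shallow:.3f} elements/deg, deep: {deep:.3f} elements/deg")
print(f"density ratio (deep/shallow): {density_ratio:.2f}")
```

With identical physical patterns, the ratio of optical densities comes out close to the ratio of the two viewing distances, which is the density difference the text describes.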

The same situation provides a second kind 
of differential stimulation for depth dis- 
crimination if the animal moves. Head 
movements or a change in his position as 
the animal looks will produce motion paral- 
lax. The velocity of angular motion of tex- 
ture elements at the line of the optic array 
corresponding to the edge of the platform 
(or the animal's nose or feet) will be differ- 
ent from the velocity of elements of the 
surface below. Motion parallax (differen- 
tial velocity of elements in the array) will 
increase as the drop increases. There will 
be a velocity difference, then, between the 
ground and the surface below, which will 
characterize the relative depth of the sur- 
face below—the amount of the drop-off. 
This velocity difference produced by the 
animal's own movement is potentially a 
highly effective kind of information about 
the relative depth downward of a surface. 
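
The relation between size of drop and motion parallax can be sketched in the same way. The sketch assumes an eye translating sideways at constant speed and considers each surface point at the instant it lies on the perpendicular to the path of motion, where its angular velocity is simply speed divided by distance. The function names, the head speed, and the distances are hypothetical illustration values.

```python
import math


def angular_velocity_deg(lateral_speed, distance):
    """Angular velocity (deg/s) of a point at the given perpendicular distance
    from an eye translating sideways at lateral_speed (same length units)."""
    return math.degrees(lateral_speed / distance)


def parallax_difference_deg(lateral_speed, edge_distance, surface_distance):
    """Differential angular velocity between the platform edge and the surface below."""
    return (angular_velocity_deg(lateral_speed, edge_distance)
            - angular_velocity_deg(lateral_speed, surface_distance))


# Hypothetical values: head moving at 2 in./s, platform edge 4 in. from the eye.
head_speed = 2.0
edge = 4.0
for drop in (0.0, 4.0, 10.0, 25.0):
    diff = parallax_difference_deg(head_speed, edge, edge + drop)
    print(f"drop {drop:5.1f} in.: velocity difference {diff:6.2f} deg/s")
```

Under these assumptions the velocity difference is zero when there is no drop and grows monotonically as the lower surface recedes, approaching the angular velocity of the edge itself as an upper limit.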

Binocular parallax is another potential
differential stimulus for depth. This is defi-
nitely a cue for depth discrimination in hu-
mans, since they have overlapping visual
fields and conjugate eye movements. All
animals with whom this research is con-
cerned have some overlapping of visual
fields (though degree of overlapping differs
greatly among species), but the extent to
which convergence and conjugate eye move-
ments are utilized is largely unknown
(Duke-Elder, 1958; Walls, 1942). The
other “cues” to depth are probably irrele-
vant here. The informative value of accom-
modation has been questioned under any
circumstance.³ Looking downward, vertical
position in the field no longer comes in as a
source of information, as it would looking
straight ahead. Aerial perspective, bright-
ness, etc., would be inoperative, unless the
drop was very great indeed.

Fig. 1. If an animal stands on the raised floor
on the left, with an identically textured surface
below on the right, the light rays reaching his
eye will differ in density, a finer density charac-
terizing the surface farthest below the eye.

But the two stimulus variables of density 
difference and velocity difference in ele- 
ments of the two surfaces are available for 
any animal with an eye to detect a drop-off. 
Whether most animals actually do so, at 
what age, and under what conditions, is the 
topic of this research. 

If the surface below the animal is literally 
untextured or homogeneous, there would be, 
presumably, no optical stimulation for sur- 
face perception. When there is no visible 
surface to descend on, the animal should 
not descend if his behavior is truly adaptive. 
Water (without ripples) might provide such 
a surface in nature, and tend to be avoided 
by terrestrial animals. It might, on the 
other hand, be approached by an aquatic 
animal, just as air (untextured light) would 
indicate a safe path for flying for a bird. 

The problem with which this monograph 
is concerned is the discrimination, by vision 
alone, of depth downward at an edge. Some 
of the questions to be answered are the fol- 
lowing: Is the discrimination present in 
animals of different species and different 
ages? Can it be detected by tendency to 
avoid a drop-off? Is the avoidance tendency 
greater, the greater the drop? What con- 
ditions or cues are actually operative in 
making the discrimination? And what is the 
role of visual experience in different species 
in promoting the discrimination? 


3 Van Tuyl (1937) in a study of monocular per-
ception of distance found that there was no indi-
cation of an “immediate and familiar sensation of
distance,” and that the majority of Ss, when
forced to rely on accommodation and convergence
alone, could not achieve any consistent accuracy
even after considerable practice.
HISTORICAL BACKGROUND


Several methods used in the study of ani- 
mal behavior have taken for granted that 
an animal will discriminate and avoid a 
drop-off. The elevated maze is useful be- 
cause the rat does not jump off. The jump- 
ing stand requires that the animal gauge his 
jump to a platform, and that he avoid falling 
into a net below. But only a few studies 
have been made of the avoidance of the 
drop itself. We shall refer arbitrarily to 
such a discrimination as perception of depth, 
to differentiate discrimination of depth 
downward, or a drop-off, from perception 
of distance ahead. 

The earliest experimental study of this 
behavior appears to be Spalding’s, in 1875. 
He blindfolded a baby pig at birth, later 
put it on a chair and removed the blindfold. 
It “knew the height to require considering, 
went down on its knees and lept down” 
(p. 507). The implication was that the pig- 
let was able to gauge the force of the jump 
appropriate for the steepness of the drop 
correctly in spite of no previous visual ex- 
perience. Since infant pigs are ambulant 
almost at once after birth, the early maturity 
of this capacity is highly adaptive. 

Thorndike, in 1899, made a study of in- 
stinctive reactions in the newly hatched 
chick. His procedure included putting the 
chick on a pedestal a certain number of 
inches above a box containing other chicks. 
The question was: at what heights would 
the chick, motivated by the sight and chirp- 
ing of its fellows, jump down; and at what 
height would it refrain from jumping. At 
10 inches or less, the average chick 95 hours 
old jumped immediately; at 16 inches, 
he waited up to 3 or 4 minutes, and so on, 
until at 39 inches, the chick would not jump 
down. Thorndike (1899) concluded that 
“at any given age the chick without experi- 
ence of height regulates his conduct rather 
accurately in accord with the space-fact of 
distance which surrounds him” (p. 284). 

An experiment by Kurke, in 1955, made
use of a similar technique. The chick jumped
from a platform of variable height to join a
group of cheeping chicks. The platform was
raised to 21 inches at the start, and the chick


was given a 30-second trial at this height. 
If it did not jump, the platform was low- 
ered by 2-inch steps until it did. Normal 
chicks 3 days old jumped at a mean height 
of 3.4 inches. Dark reared chicks 1 and 2 
days old would not leave the platform, but 
dark reared chicks 3 days old jumped at a 
mean of 24 inches. A group given “en- 
forced vertical experience" in the brooder, 
run at 10 days, jumped at a mean height of 
6.4 inches. These chicks were provided with 
a platform and ramps 15 inches high in the 
brooder. They were compared with a “re- 
stricted” group which had wire mesh just 
over their heads in the brooder. The re- 
stricted group jumped at a significantly low- 
er height. What conclusion should be drawn 
from this comparison is not clear for two 
reasons. The chicks provided with ramps 
and platforms had far more opportunity to 
develop motor coordination. Furthermore, 
the 30-second trial length was probably 
much too short. Thorndike’s research (and 
our own) showed that the chick often hesi- 
tates up to 10 minutes before going any- 
where. 

The depth discrimination of turtles pre- 
sents an especially interesting problem, since 
the same type of adaptation might not be 
expected of both land and aquatic species. 
Yerkes, working at the New York Zoologi- 
cal Gardens in 1904, made a study of the 
space perception of three species of tortoises, 
one aquatic (Chrysemys picta), one terres-
trial (Terrapene carolina Linnaeus), and
one which is both aquatic and terrestrial 
(Nanemys guttata). The turtle was placed 
in the middle of a board elevated 30, 90, or 
180 centimeters above a black net, and its 
time to leave the board measured. The three 
species reacted differently; the terrestrial 
turtle and the one of mixed habits failed in 
a majority of cases to leave the board at all 
during a 60-minute period when it was 
raised to a height of 180 centimeters, but 
all except one of 40 aquatic turtles came 
down. All three species, however, showed 
increasing hesitation with increasing height. 
Yerkes (1904) concluded that 
hesitation in the presence of the void increases as 
we pass from the strictly water forms to those 
which are land inhabiting . . . [and that] 
total inhibition of the reaction, i.e., failure to crawl
over the edge of the board in 60 minutes, appears 
at a much less height for the land species than 
for the waterland and water forms (pp. 20-21). 
According to Yerkes, the land turtles “mani- 
fested fear” of the heights. For the aquatic 
turtles, presumably, jumping off an edge 
would have been associated with landing in 
water, a place of safety. When blindfolded, 
these animals pushed off any height with no 
hesitation; but T. carolina, the land turtle,
was inactive when blindfolded. 

Depth discrimination of rodents was re- 
ported on by Waugh, in 1910. The mouse 
was placed on a pedestal at varied heights, 
and its time to jump down measured. With 
two mice, he found a graded increase of 
time as the pedestal was raised from 4 to 18 
centimeters. The column supporting the 
pedestal was visible. He then changed the 
platform so that it hung from a higher sur- 
face, eliminating the pedestal, and placed a 
sheet of glass 4 inches below it but still 
above the floor. The animal did not per- 
ceive the surface of the glass and behaved 
as if the surface below it were that of the 
floor. The times were graded with distance 
when a board was lowered below the disk.⁴

In 1932, Russell showed that rats on a
jumping stand would gauge the force of a
jump according to the distance of the target.
The jumping stand involves discrimination
of both depth downward and distance ahead.
There is edge avoidance and gap jumping,
a notorious conflict situation, requiring con-
siderable persuasion to make the animal
jump. Russell gave 10 trials at different dis-
tances in chance order. Force and distance
were reported to be directly related in the
case of every rat. Albinos were somewhat
inferior to pigmented rats (they “jumped
short” and their curves did not rise as
steeply). Monocular animals performed as
well as binocular ones. As distance in-
creased, so did “disinclination to jump.”

4 Waugh also tried a sort of obstacle test for
perception of distance of objects ahead of the
animal (which had to swerve to avoid the objects). Many
errors were made in this situation. He noted that
head movements “giving several points of view of
the object, each from a different angle” seemed to
accompany right choices, suggesting that motion
parallax was operating. Other experiments with
rodents on distance ahead are those of Robinson
and Wever (1930) and Greenhut (1954).

The same procedure was applied by Lash- 
ley and Russell in 1934 to rats which had 
been raised in a darkroom. For a series of 
distances, force was nearly as accurately 
graded as it was in animals raised under the 
usual lighting conditions. The dark-reared 
animals were inferior in motor coordination 
(take-off and landing) but force was just 
as precisely graded as in normally reared 
animals. Lashley and Russell concluded that 
there is an innate mechanism by which the 
relative force exerted is immediately ad- 
justed to relative distance, and that dis- 
crimination of depth is not dependent on 
past experience. 

The conclusions drawn from this experi- 
ment have been criticized on several grounds. 
It has often been pointed out that the dark- 
reared animals had to be trained in the light 
to perform on the jumping stand, thus pro- 
viding opportunity for visual experience be- 
fore the depth judgments were tested. 
Greenhut and Young (1953) repeated Rus- 
sell’s experiment (although they added 
shock as an incentive), and concluded that 
“distance is not appreciated visually” and 
that, with a random order of presentation 
of distances, there was no correlation of 
force and distance. They also argued that 
jump force is a poor criterion because it is 
not always correlated with an accurate jump. 
Since they used shock and reported that 
their animals were “emotionally disturbed,” 
it is not unreasonable that Russell’s results 
were not duplicated. Their criticisms, how- 
ever, make confirmation by another tech- 
nique important. 

Surprisingly little experimental work has 
been done on space perception—especially 
depth downward—in large animals. War- 
kentin and Smith (1937) studied develop- 
ment of visual acuity in the cat, and re- 
ported that visual placing reactions of the 
forelimbs occurred at a mean of 25 days. 
The visual placing is presumably a sign of 
distance perception; whether it is related 
to discrimination of a drop downward re-
mains to be seen. It is interesting to note,
with regard to this response, that Riesen
and Aarons (1959) found the visual placing 
reaction absent in cats reared in the dark 
until 6 weeks of age. Dark-rearing may, 
therefore, prevent normal maturation of the 
capacity for depth discrimination in the cat, 
although it appears not to prevent depth dis- 
crimination in the rat and the chick. Con- 
firmation of such a differential result is 
needed. 

The primates, at least at a mature stage, 
presumably have good discrimination of a 
visual drop, but there has been little develop- 
mental work on either apes or human chil- 
dren. Riesen's dark-reared chimpanzees (Rie- 
sen, 1950) were slow in learning to avoid 
the approach of a striped disk which gave 
an electric shock if it touched them, but no 
systematic test of visual depth discrimina-
tion was made.

There are no studies at all of perception 
of a drop-off in human infants and young 
children. In fact, only a few studies of dis- 
crimination of distance ahead exist (Denis- 
Prinzhorn, 1960; Johnson & Beck, 1941; 
Updegraff, 1930; and, by inference, Cruikshank's
1941 study of size-constancy in infants). In
human adults, it is interesting
to note a result of an experiment with air- 
borne trainees (Windle, Ward, Nedved, & 
Nathan, 1956). Trainees were required to 
learn proper aircraft exit techniques by 
jumping from a tower that permits a free 
fall of 8 feet before the jump is snubbed. 
Different groups were required to make the 
jumps from each of three heights (18 feet, 
26 feet, and 34 feet). Although the physical 
fall experienced by each group is the same
and only the perceived visual height differs
between groups, a cumulative curve, day by
day of training, showed that more satisfac-
tory jumps were made, the lower the height.
Hesitation (and poor technique) in jump-
ing increased the greater the height of the
platform from the ground.

Clearly, there are many gaps in our
knowledge of behavior when an animal or
a human infant or child stands at the verge
of a sheer drop. Whether, how soon, and
by what optical information he avoids the
drop are questions to be answered.


APPARATUS AND PROCEDURE 


To conduct a comparative study of the 
visual discrimination of depth downward, an 
apparatus fulfilling the following require- 
ments is necessary: it must permit con- 
trol of all cues other than optical ones, no 
pretraining should be required, and sub- 
stantially the same apparatus should be 
adaptable for testing many different species.

Of the kinds of apparatus used in the ex- 
periments reviewed in the introductory sec- 
tion, none satisfied these requirements com- 
pletely. For instance, the force of jump 
measure used by Russell (1932) required ex- 
tensive pretraining; it may have allowed the 
subject to utilize some cues other than visual 
ones, and it is adaptable only for organisms 
which can be trained to jump. The graduated 
series of heights used by Waugh with mice, 
Thorndike and Kurke with chicks, and Yerkes 
with turtles, can be used for a variety of 
animals, but it might be dangerous for sub- 
jects of poor locomotor ability. Further- 
more, some nonvisual cues (e.g., echoloca- 
tion) might be utilizable. And finally, a 
measure of refusal to jump down, alone, 
might be contaminated in some experiments 
by side effects of previous rearing condi- 
tions (e.g., dark-rearing or isolation, which 
increase “emotionality”; Gibson, Walk, &
Tighe, 1959). 

The apparatus designed for the present 
experiments, which we named the “visual 
cliff,” uses the principle of a drop-off or 
graduated heights, but gives the animal a 
choice between a short drop-off on one side 
of a center board and a long drop-off on the 
other side. A terrestrial animal, if it de- 
tected the difference, should prefer the short 
drop-off at a safe depth to the long drop- 
off at a dangerous depth. To eliminate non- 
visual cues that might permit detection of 
the difference, such as auditory, olfactory, 
or temperature differentials⁵ from near or
distant surfaces, a sheet of glass was in-
serted under the center board where the
organism was placed, so as to extend out-
ward across both the shallow (“safe”) drop
and the deep (“dangerous”) one. The glass
was placed over the shallow side as well as
the deep side to equate stimulation pro-
duced by the glass itself, if any (e.g., reflec-
tions), and to equalize tactual cues for loco-
motion.

5 Of these cues, the only one reliably demon-
strated to be a cue for distance is echolocation
(Griffin, 1958; Riley & Rosenzweig, 1957).

Patterned material (wallpaper, linoleum, 
etc.) could be placed directly under the glass 
on the shallow side and on the floor below, 
at any desired distance, on the deep side. 
Information in the light coming to the ani- 
mal's eye from the patterns on either side, 
in combination with stimulation produced 
by the animal’s own motion and ocular 
equipment constituted the stimulus basis for 
visually differentiating the two sides. Fig- 
ure 2 shows diagrammatically the situation 
created by the apparatus. 

In summary, the subject was allowed to 
descend to either an optically shallow or to 
an optically deep surface from a center 
board between the two surfaces. If the sub- 
ject could not or would not locomote, of 
course no data could be obtained. 

The general procedure was as follows:
The animal was placed on the center board,
normally by hand, although the first experi-
ments with rats placed the animal on the
board in a small box to eliminate handling
bias. To equate for position preferences,
half of the animals were started from one
end of the board, half from the other. Ob-
servation periods varied for each species
but were adequate to permit descent from
the board. Descent to the glass surface was
whenever possible left to the natural ex-
ploratory tendency of the subject. At the
end of the observation period, the animal
was removed and the board and glass sur-
face cleaned with a damp sponge. To equate
odor cues, the sponge was used on the glass
surface over both the shallow and deep sides
regardless of the side of descent.

Fig. 2. Diagram of the visual cliff, in cross sec-
tion. (The animal is placed on a raised board in
the center. On the left side is a patterned surface
only a short drop below his feet, the shallow side,
and on the right is the same patterned surface
placed much farther below, the deep side. Glass
extends across both sides from the base of the
center starting board.)

Controls that were run to make sure no
extraneous stimulation might cause a
preference for one side or the other will be
described later.

Three models of the visual cliff were 
made. The first, a relatively crude model, 
served to explore this method of testing 
depth discrimination. A second one was 
then built which permitted more precise con- 
trol of stimulus factors. A third one was 
constructed for testing larger animals. 


Cliff Model I 


This apparatus, shown in Figure 3, consisted of
four ring stands with clamps that supported
two sheets of glass (24" x 32") parallel to the
floor and 53” above it. An unpainted board 24”
long X 4" wide X 3" high divided the glass into
two equal parts. Between the sheets of glass on
one side of the board (the "shallow" side) was
inserted a sheet of patterned wallpaper of 8"
green, white, and grey checks. The same wallpaper
was placed on the floor and on the walls below
the glass surface. Above the floor of the glass
surface, cheesecloth surrounded the apparatus to
shield the experimenters partially from the sub-
ject’s view. The apparatus was placed in a corner
of the room and the subject descending from the
board toward the shallow side went toward the
observers. All animals used in this apparatus were
placed on the center board in a box to avoid any
handling bias. The apparatus was illuminated from
above by fluorescent lighting and additional illumi-
nation was supplied the deep side from below to
equate the two fields in brightness. Brightness
readings are shown in Table 1.

Fig. 3. Photograph of visual cliff, Model I.
(The experimental testing situation is shown to
the left and a control condition where the textured
surface is placed under the glass on both sides of
the center board is shown to the right. The wall-
paper pattern on the shallow and deep sides is
the same.) (Reprinted from Science by permission)


Cliff Model II 


The apparatus was modified in the second model 
to eliminate visible supports by hanging the entire 
apparatus from ceiling supports. The modifications 
also made for better control of illumination, less 
reflection from the glass, and provided for a con- 
tinuous visible texture on the deep side. This 
apparatus permitted much more precise experi- 
mental control of the optical stimulation. 

The apparatus, which is schematically shown in
Figure 4, consisted of a hollow enclosed box with
a floor of glass. The walls were of 2” pine and
the floor was two 16" X 20" pieces of glass; the
outside dimensions were 32" long, 20" wide, and
94” high. The glass was supported by right angle
aluminum $" X 4” fastened to the side of the
walls and protruding under them. An additional
piece of aluminum provided support for the tex-
tured surface to be placed directly under the glass
on either side of the center board. These textured
patterns were fastened to masonite pieces 16" x 20"
that slid under the glass. The center board meas-
ured 18" long, 3$" wide and its height could be
varied from $" to 32”. Both the box and the
center board were painted a flat grey.

Fig. 4. Drawing of visual cliff, Model II. (The
box containing the cliff is hung from the ceiling.
The animal stands on a center board, looking to-
ward the deep side. The shallow side is covered
with a patterned material under glass. The iden-
tical patterned material covers the floor below, as
far as the animal's vision can range from the cen-
ter board. The glass extends out from the board
over the deep side as well.)

Two small bulbs fastened to the side of the 
apparatus with reflectors above them supplied ad- 
ditional illumination, if required, to the textured 
surface on the floor. The textured surface on the 
floor covered enough of the floor of the room away 
from the center board to provide a continuously 
textured surface on the deep side. When the 
height of the apparatus was set at 10" the textured 
material extended 5' away from the center board 
and 1' underneath it; at the 25" height the tex-
tured surface was also taped 15" up the far wall
of the room itself which was 6' from the center 
board. 

The box containing the apparatus hung from two 
parallel ceiling beams of 2" X 4" pine, 100" long, 
placed 32" apart and 84" above the floor. The box 
hung from the ceiling beams by two adjustable 
sliding wooden supports measuring 2" X 34” x 80", 
one fastened at each end of the cliff apparatus and 
sliding into grooves in the ceiling beams. The 
adjustable supports were held at a desired height 
by the insertion of a large nail through a hole in 
the board above the grooves. 

The ceiling above the apparatus consisted of a 
sheet of white cotton cloth tacked to the underside 
of the ceiling beams across the width of the room 
and hanging 4' down toward the floor. A piece
of 22" X 28" cardboard was attached to each sus- 
pending support. The cloth and the cardboard 
served to diffuse the illumination from the ceiling 
of the apparatus and, in addition, to shield the 
details of the apparatus and the experimenters 
from the subject’s view. 

The illumination from above was supplied by
two 150-watt bulbs hung at each end of the room
above the diffusing cloth. Aluminum foil was hung
over a metal pipe 27" above the main supporting
beams and fastened to them so as to make a tent-
shaped roof. The aluminum foil concentrated the
light so that the illumination came from the ceiling
above the apparatus and little could be reflected
from elsewhere in the room. The foil also served
to diffuse the light evenly across the ceiling so
that no single bright area or "hot spot" was re-
flected in the glass of the apparatus below. On the
whole, the diffusion of light from the ceiling
served to illuminate the apparatus homogeneously
and to minimize reflections in the glass.

TABLE 1

BRIGHTNESS READINGS TAKEN WITH
WESTON MASTER III

Cliff Model                          Reading

I (with small checked pattern)
    Shallow side                     [illegible]
    Deep side                        [illegible]

II (34" red checks)
    Shallow side                     [illegible]
    Deep side                        [illegible]

III (green and white tile)
    Shallow side                     3.2
    Deep side                        3.2

Note. Multiplying the Weston reading by 4 gives an ap-
proximate foot-candle measurement (a reading of 1.6 is about
6.4 ft-c.).
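
The conversion stated in the note to Table 1 (Weston reading multiplied by 4 gives approximate foot-candles) can be written out as a short sketch. The function and constant names are hypothetical; the factor of 4 and the 1.6 example come from the note itself.

```python
# Approximate conversion factor given in the note to Table 1.
WESTON_TO_FOOT_CANDLES = 4.0


def weston_to_foot_candles(reading):
    """Approximate foot-candle value for a Weston Master III reading."""
    return WESTON_TO_FOOT_CANDLES * reading


# The note's own example: a reading of 1.6 is about 6.4 ft-c.
print(f"{weston_to_foot_candles(1.6):.1f} ft-c")
```

The same function can be applied to any legible reading in Table 1; the result is only as accurate as the approximate factor the note supplies.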

Experimenters observed from behind the card- 
board at the shallow side. 


Cliff Model III 


This model of the visual cliff was designed to 
test larger animals and human infants. A table 
was constructed of 2" X 4" pine, measuring 8' 
long, 6' wide, and 40" high. Supporting legs were 
placed at each corner and in the middle of each 
long side where an additional supporting cross 
beam was also used. Two large pieces of Herculite 
glass 4' x 6' X 8" formed the surface of the 
table. Under the shallow side a 46" x 684” x 3"
piece of composition plywood i" below the glass 
was placed to support a textured surface, an irreg- 
ular green and white pattern of linoleum tile that 
matched the floor. The same tile pattern was laid 
over the center board which measured 6' X 114” x 1”. 

The cliff table was entirely surrounded by an 8" 
high board of 2” pine to protect the subject from 
accidentally falling off the cliff.

Illumination was supplied by fluorescent lighting
directly over the apparatus. The fluorescent lights
were covered with brown wrapping paper to diffuse
the lighting more evenly over the ceiling and to
minimize reflection of the lights from the glass.
This model of the visual cliff is illustrated in
Figure 5.


Fig. 5. Drawing of the visual cliff, Model III.
(This cliff differed from Model II chiefly in size
and strength. Because of its weight it was supported
by legs. An infant is starting from the
center board toward the shallow side. The entire
floor of the room is covered with the checkered
linoleum identical with that on the cliff.)


BASIC EXPERIMENTS AND
VALIDATION OF THE CLIFF


This section will be devoted to a report 
of basic experiments with the visual cliff 
and a number of control experiments run 
to provide validation for the technique. The 
purpose of all the control experiments was 
to demonstrate that the animals were re- 
sponding to the visual cues provided by the 
textured surfaces at different depths below 
the animal. The possibility of choices de- 
pending on other factors, such as brightness, 
reflections, and position of the experimen- 
ters had to be ruled out. The experiments 
included here also varied textures, heights, 
and the apparatus design itself. The sub- 
jects for these experiments were all hooded 
rats (Long-Evans stock) reared in the lab- 
oratory colony. Hooded rats were chosen 
for their availability in large numbers, their 
small size which made control of apparatus 
and environmental factors practical, and the 
fact that their natural exploratory drive 
solved the motivation problem. No rat was
run more than once unless the report of an
experiment specifically states otherwise.


Original Experiment 


The first experiment³ was run on the original
apparatus (Model I). The measures taken were
side chosen on first descent, time spent on either
side, and number of crossings back and forth (if
any). The textured material was directly under
the glass on one side (shallow), 3" below the surface
of the center board. On the other side
(deep), the textured material was on the floor,
53" below the surface of the center board. The
animal was placed on the center board and then
observed for 5 minutes. A second group of animals
was run under a control condition in which
the textured paper was placed directly under the
glass on both sides. The side which was deep for
the experimental group is referred to similarly
(Table 2) for this group as well, for purposes of
comparison. If the animals had a preference for
one side or the other, due to irrelevant factors
such as the position of the experimenters, the
control group's behavior should reveal it. The
experimenters actually stood closest to the shallow
side.


³ This experiment was reported in Science, June
1957.


Results. It can be seen in Table 2 that 
the experimental group tended strongly to 
descend on the shallow side, did not cross 
back and forth at all, and passed a majority 
of its time on the same side. The prefer- 
ence for the shallow side is especially con- 
vincing in view of the closeness of the ex- 
perimenters to that side. 

The control group, on the other hand, 
showed no preference for either side in first 
descents, did in a number of cases cross 
back and forth, and spent more time on the 
side which was deep for Group E. 

This experiment led us to the conclusion 
that hooded rats will avoid a visual cliff— 
a long visual drop-off as compared with a 
short one. It should follow, therefore, that 
animals placed on a center board with the 
textured surface far below the glass on 
both sides should be hesitant in descending 
either way. We next set up such an experi- 
ment. The same rats were run under two 
conditions: first, in the control condition 
already described, with the checked pattern 
directly under the glass on both sides; and 
second, in a new control condition with the
checked pattern far below the glass on the
floor on both sides. The animals were
placed on the center board and observed
for 3 minutes. The number of animals descending
in 3 minutes and the median time
to descend are presented in Table 3.


TABLE 2

COMPARISON OF EXPERIMENTAL AND CONTROL
GROUPS ON THE VISUAL CLIFF APPARATUS

                                         Experimental    Control
                                         group           group
                                         (N = 29)        (N = 10)

Percentage descending on shallow side    88.5            50.0
Mean number of crossings in 5 minutes    0.0             1.70
Percentage of time on:
    Shallow                              76.0            24.1
    Deep                                 10.0            61.5
    On board                             14.0            14.4


TABLE 3

NUMBER OF DESCENTS AND LATENCY OF DESCENDING
UNDER TWO CONTROL CONDITIONS, BOTH
SIDES IDENTICAL
(N = 11)

Control condition                Number of animals    Median latency
                                 descending           (seconds)

Pattern directly under glass     9 (81.8%)            9
Pattern on floor                 4 (36.4%)            120


As would be expected, the animals were 
much slower to descend when the patterned 
surface was on the floor (53" below) than 
when it was directly under the glass. The 
majority of animals did not descend at all 
in this condition, although they were being 
run for the second time. 


Comparison of Different Patterns and 
Depths of Visual Surfaces 


Following the first experiment, a new apparatus
was built (Model II), as described
in the previous section.


The animal's entire field of view was controlled 
by this apparatus. The old wallpaper pattern was 
replaced by three other patterns, in turn. One 
was a fine textured pattern in grey and white with 
rather low contrast. A second was a coarse tex- 
tured pattern (1" squares) with higher contrast. 
The third was actually without pattern or texture 
—a light grey homogeneous surface. As in the 
previous experiment, the material was placed di- 
rectly under the glass on one side and at some 
distance below on the floor on the other side. The 
center board was 4” high. Since the new appa- 
ratus could be raised and lowered, the fine texture 
was tried at two depths, 25” and 10" (Conditions 
A and B in Table 4). The coarse texture was 
placed in the basic experiment at 10” (Condition 
D). A control experiment was also run with the 
fine texture, with the material directly under the 
glass on both sides (Condition C). This was a 
replication of the earlier control experiment with 
the new apparatus and pattern. The other control 
(pattern on the floor below on both sides) was run 
with the coarse texture at a depth of 10” (Condi- 
tion E). 
TABLE 4

PLACE OF DESCENT AND MEDIAN LATENCY FOR ADULT HOODED RATS WITH
DIFFERENT TEXTURES AND DEPTHS OF VISUAL SURFACE

                                                  Shallow side         Deep side        No descent
Condition                                   N     %       Latency      %       Latency      %
                                                          (seconds)            (seconds)

A. Fine texture, 25" deep side              22    95      12           0                    5
B. Fine texture, 10" deep side              12    100     6.5          0                    0
C. Fine texture, 0" both sides              16    56.7    7            43.8    7            0
D. Coarse texture, 10" deep side            15    93      17.5         0                    7
E. Coarse texture, 10" both sides           20    30      33           5       40           65
F. Untextured, 10" deep side                20    35      1.5          40      10           25
G. Coarse texture, 10" deep side,
   far side brighter                        10    100     5            0                    0
H. Reflection eliminated, 10" deep
   side, coarse texture                     20    90                   10                   0
The untextured homogeneous grey material was
placed directly under the glass on one side, and
10" below on the other (Condition F). This condition
provided a further control for extraneous
cues. If any stimuli other than visual stimuli for
depth were determining the preference for the
shallow side, the preference should persist in this
condition. If not, the two sides should be chosen
about equally or the animal should remain on the
board. In these experiments, the animals were
placed on the board and then observed for 3 minutes.
Median latency and place of descent were
recorded. No animal was run in more than one
condition.

Results. In Table 4 it can be seen that 
both textures, fine and coarse, were effec- 
tive. The animals descended over 90% to 
the shallow side, and none descended to the 
deep side (Conditions A, B, and D). A
10” visual drop-off was just as effective for 
this preference as a 25" one. 

When the fine texture was directly under 
the glass on both sides (Condition C), the
preference disappeared, as it had in the pre- 
vious control experiment. About 44% of 
the animals descended to the deep side, as 
contrasted with none in the experimental 
groups. 

When the coarse texture was placed 10” 
below the glass on both sides (Condition 
E), 65% of the animals refused to descend
at all in the 3-minute interval. 

When the material under the glass was 
untextured (Condition F), 40% of the animals
chose the deep side, 35% the shallow
side, and 25% refused to leave the board. 
Since there was no visible grain in the sur- 
face (to the human eye, at least) this was 
not surprising. The prediction that there 
should be no preference for either side with 
a textureless surface was confirmed. 


Brightness Control 


When the main source of illumination was from
above, the closest surface to the light source,
which was the shallow side of the apparatus,
received the most illumination. To balance brightnesses
it was necessary to add some illumination
to the deep side. The procedure for each type of
apparatus and each experiment was to try to balance
the brightnesses so as to make them as
nearly equal as possible. The ordinary brightness
readings for each apparatus have been indicated
in the apparatus section. But brightnesses were
not always exactly equal, particularly for an
apparatus like the large visual cliff (Model III)
which was influenced by variations in daylight
illumination. What is the effect of an artificial
manipulation of brightnesses?


Fig. 6. One of the control experiments showing
a hooded rat on the board of Cliff Model II.
(Plain, untextured grey paper lay under the glass
on both sides. The grey surface to the right of
the animal is actually 10" below the glass; that
on the left of the animal is immediately below it.)
Photograph by William Vandivert.

Extra illumination was added to the deep side 
of the standard setup (Model II) so that the 
Weston Master III reading was 0.8 on the shallow 
side and 6.5 on the deep side. The deep side was 
10" in depth and the ¾" checkered pattern was
used. Results appear as Condition G in Table 4. 
Ten hooded subjects were used and all 10 (100%) 
descended on the shallow side. The original ap- 
paratus had slightly greater brightness on the 
shallow side (1.6 reading) as compared to the 
deep side (1.4 reading). 

It seems clear that a small variation in bright- 
ness has no influence on the behavior observed in 
this apparatus, since the preference for the shal- 
low side existed despite added brightness on either 
side. 
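
As an illustrative aside, the conversion given in the note to the brightness table (Weston reading multiplied by 4 gives approximate foot-candles) can be applied to the readings quoted in this section. The sketch below is only that arithmetic; the function name and the labels are ours and do not come from the original report.

```python
# Approximate conversion factor stated in the note to the brightness table.
FOOTCANDLES_PER_WESTON_UNIT = 4.0


def weston_to_footcandles(reading: float) -> float:
    """Convert a Weston Master III reading to approximate foot-candles."""
    return reading * FOOTCANDLES_PER_WESTON_UNIT


# Readings quoted in the brightness-control text (labels are illustrative).
readings = {
    "ordinary setup, shallow side": 1.6,
    "ordinary setup, deep side": 1.4,
    "Condition G, shallow side": 0.8,
    "Condition G, deep side": 6.5,
}

footcandles = {label: weston_to_footcandles(r) for label, r in readings.items()}

for label, value in footcandles.items():
    print(f"{label}: about {value:.1f} ft-c")
```

On these figures the deep side in Condition G was roughly eight times as bright as the shallow side, a far larger imbalance than the ordinary 1.6 versus 1.4 readings.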


Reflection Control 


Although the controls first described seemed to 
meet any criticism which occurred to the authors, 
it was decided to make a further effort to elimi- 
nate reflections from the glass. It might conceiv- 
ably be argued that the reflections were unequal 
on the two sides since one was lighted only from 
above and one from both below and above. If all 
the light came from below in both cases, there 
should be no reflections at all. The lighting was 
therefore rearranged as follows. Two 17" fluores- 
cent bulbs (15-watt, 1” diameter) were placed 
under and along the center board, parallel to one 
another. The textured surface on the shallow 
side was lowered to 2" below the glass, so that 
its illumination came from the fluorescent bulb on 
its side. The textured surface on the deep side 
was on the floor, 10" below the glass, as before 
and was illuminated by the fluorescent bulb on its 
side. No other illumination was used.

A new center board, wider than before, was
introduced in order to cover completely the lights
below. It was covered with the same patterned
material used for the textured surfaces. It was
only 1½" high (above the glass), since the shallow
surface was 2" below the glass. This meant that
the depth from the top of the board to the textured
surface was 3½" on the shallow side and
11½" on the deep side. The fact that the board
was only 1½" above the glass meant that the animals
could touch the glass without coming down
(i.e., could feel an equivalent surface on both
sides). This tactual cue was unavoidable with the
lighting coming from below, since the surface was
already 2" below the glass and a greater height
might inhibit the animals from descending on the
shallow side as well as the deep. Since the tactual
cue worked against the result predicted if the
animals responded to the visual depth cues, it
could not prejudice the results in that direction.
(It would have the effect of equalizing choices of
the two sides.)

The underneath lighting was very effective in 
eliminating reflection; a human observer was not 
aware of the presence of the glass at all. 

Twenty animals were run on this setup (Condition
H in Table 4). Of these, 90% went to the
shallow side, and 10% to the deep. The relative 
visual depth of the two sides was therefore effec- 
tive in creating a preference with reflection elimi- 
nated, even with tactual cues available that worked 
for equality. 


Threshold Determination 


The fact that a 10" depth was just as effective 
as a 25" depth led us to wonder at what point 
the drop-off would cease to be effective. The objective
of this experiment was a psychophysical
curve showing the relationship between depth and 
descent behavior. Instead of the two-sided cliff, 
which makes the animal's choice a relative one, it 
was decided to use a one-sided cliff with grad- 
uated depths. The animal was placed on a board 
4" above the glass; the textured surface was di- 
rectly under the glass, or at varied depths below 
it. The rat was observed for 3 minutes and 
scored simply on descent or no descent. 

Since previous animals, with the coarse (¾"
checks) texture, had avoided the deep side almost
100% when the pattern was 10" below the glass,
this depth was the greatest used. The other
depths (below the glass) were 0, 2, 4, 6, and 8
inches. Since the animal was placed on a 4" board,
the drop-off from his feet was actually 4, 6, 8,
10, 12, and 14 inches.

The board on which the animal was placed was 
covered with the same checked pattern as that on 
the surface below. The center board was inserted 
in the box just as it had been previously, but a 
back wall was provided so that the animal could 
look down on only one side. 

The effect of the glass, as compared with a no-glass
situation, was also tested in this experiment.
Each of the six depth steps was tested with glass
in the usual position and without the glass. Each
animal was run twice in the experiment, once on
the no-glass and once on the glass condition, but
with a different depth step on the two runs. There
were 120 animals. Each step was therefore tested
40 times (40 different subjects), 20 times with
the glass present and 20 without. These groups
were further divided into first runs and second
runs, so that first and second runs were represented
equally at each step, and in the glass and
no-glass conditions. Time of day of running was
also equated for the groups. Half the animals
were males and half females. They were distributed
equally through the groups.
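
The bookkeeping of this design can be checked with a few lines of arithmetic. The sketch below simply recomputes the figures stated above (six depth steps, a 4" board, 120 animals run twice each); the variable names are ours and the code is not part of the original procedure.

```python
# Depths of the textured surface below the glass, in inches (from the text).
DEPTHS_BELOW_GLASS_IN = [0, 2, 4, 6, 8, 10]
BOARD_HEIGHT_IN = 4        # height of the board above the glass
N_ANIMALS = 120
RUNS_PER_ANIMAL = 2        # one run with glass, one without
GLASS_CONDITIONS = 2       # glass present / glass absent

# Drop-off measured from the animal's feet rather than from the glass.
drop_offs_in = [BOARD_HEIGHT_IN + depth for depth in DEPTHS_BELOW_GLASS_IN]

# Total runs are spread evenly over the six depth steps.
total_runs = N_ANIMALS * RUNS_PER_ANIMAL
runs_per_step = total_runs // len(DEPTHS_BELOW_GLASS_IN)
runs_per_step_per_glass_condition = runs_per_step // GLASS_CONDITIONS

print("drop-offs (in.):", drop_offs_in)
print("runs per depth step:", runs_per_step)
print("runs per step and glass condition:", runs_per_step_per_glass_condition)
```

The 40 runs at each step come from 40 different animals because each animal met a different depth step on its two runs.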

Results. Table 5 shows the percentage of 
animals descending at each of the six depth 
steps. The combined (overall) percentage 
represents results on 40 animals for each 
step. There is an unbroken slope from zero
depth (between glass and pattern) to the
10" depth. At zero depth 72.5% of the animals
descended. Figure 7 shows the curve
plotted from these percentages. It drops 
very steeply from zero depth to the 6" 
depth, where only 10% of the animals came 
down. This point seems comparable to a 
threshold, since there is a marked change 
in the slope of the curve here. Up to this 
point, however, the curve falls in a straight 
line: the steeper the drop-off, the fewer the 
descents. 

Both the glass and the no-glass conditions 
yielded a steadily falling curve, showing 
percentage of descents to be a function of 
degree of depth. There is not much differ- 
ence between the percentages for the two 
conditions. At those depth steps where the 
groups could be compared by chi square 
(they could not where any frequency was 
zero), two proved to be barely significantly 
different (p < .05 for both differences, 
without a correction), the 2” step and the 
4” step. More animals descended when 
there was glass below them. This is not unreasonable,
for there are some cues provided
by the glass to indicate that it is a surface
(e.g., reflections, echolocation).


TABLE 5

PERCENTAGE OF ANIMALS DESCENDING AT EACH
DEPTH ACCORDING TO PRESENCE OR ABSENCE OF
GLASS, FIRST OR SECOND RUN, SEX OF ANIMAL,
AND WITH GROUPS COMBINED

Columns: Depth
Rows: Glass; No glass; First run; Second run; Males; Females; Combined groups
[cell values illegible]
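
The glass versus no-glass comparison at a single depth step is a 2 x 2 table (descended or not, glass or no glass) with 20 runs per condition. A minimal sketch of an uncorrected chi-square test of the kind described is given below. The counts in the example are hypothetical, chosen only for illustration, because the cell values of Table 5 are not available here; the function name is ours.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square without continuity correction for [[a, b], [c, d]].

    Returns (chi_square, p_value) for 1 degree of freedom. Following the
    text, the test is not attempted when any observed frequency is zero.
    """
    if min(a, b, c, d) == 0:
        raise ValueError("chi-square not computed when a frequency is zero")
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of chi-square with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# HYPOTHETICAL counts for one depth step (not data from the report):
# with glass 14 of 20 descend, without glass 7 of 20 descend.
chi2, p = chi_square_2x2(14, 6, 7, 13)
print(f"chi-square = {chi2:.3f}, p = {p:.4f}")
```

With these made-up counts the difference falls just under the .05 level, which is the sense in which a difference can be called barely significant without a correction.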

A comparison of the first and second runs 
indicates a greater tendency to descend at 
five of the depth steps on the first run. The 
two runs were done on separate days, but 
despite this, the animals’ exploratory drive 
appeared to have diminished. This trend 
was investigated further in another experi- 
ment. There was no consistent difference 
between males and females. 

This experiment permits the conclusion 
that, for the hooded rat, the tendency to 
avoid a drop-off is a simple monotonic func- 
tion of the degree of depth or steepness of 
the drop-off, up to a limit of no descent. 


Effect of Repetition 


Since the foregoing experiment appeared
to indicate some diminution in number of
descents on a second run, an experiment
was designed to test the effect of repetition.
The double cliff (Model II) was used, with
a coarse pattern and a depth of 10" on the
deep side. Subjects were 5 male and 5 female
adult hooded rats. Each was run once
daily for 5 days. Time and place of descent
were recorded. The animal was removed
at the end of 3 minutes and "no descent"
recorded if it remained on the board continuously.


Fig. 7. Percentage of animals descending at each
of six depth steps, with a one-sided cliff. (Animals
were run at each depth with glass below
the board, and without glass.)
[Axes: percentage of animals descending within 3 minutes, against
depth of textured surface below center board in inches (0 to 10).
Curves: combined groups, with glass, without glass.]


TABLE 6

PLACE OF DESCENT AND LATENCY AS A
FUNCTION OF REPEATED TRIALS
(N = 10)

                             Day 1    Day 2    Day 3    Day 4    Day 5

Percentage to deep           0        10       0        0        0
Percentage to shallow        100      90       70       60       40
No descent                   0        0        30       40       60
Mean latency in seconds      9.1      48.9     72.9     98.2     138.3

Table 6 shows clearly that there is no 
trend toward exploring the deep side as 
trials are repeated. But there is a tendency 
for the animals to remain longer on the cen- 
ter board—to go nowhere. Latencies in- 
creased consistently, as did number of no 
descents. 

Depth discrimination shows no change as 
trials are increased. The result of repeti- 
tion in this species is extinction of a gen- 
eral tendency to explore and hence to de- 
scend from the center board. 
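
The two claims drawn from Table 6 (rising latencies and no-descent rates, with no shift toward the deep side) can be checked directly against the tabled values. The following sketch does only that; the variable and function names are ours.

```python
# Values from Table 6 (N = 10), Days 1 through 5.
pct_deep = [0, 10, 0, 0, 0]
pct_shallow = [100, 90, 70, 60, 40]
pct_no_descent = [0, 0, 30, 40, 60]
mean_latency_s = [9.1, 48.9, 72.9, 98.2, 138.3]


def is_nondecreasing(values):
    """True if each value is at least as large as the one before it."""
    return all(later >= earlier for earlier, later in zip(values, values[1:]))


# Each day's three outcomes should account for all animals.
rows_sum_to_100 = all(
    d + s + n == 100 for d, s, n in zip(pct_deep, pct_shallow, pct_no_descent)
)

latency_rising = is_nondecreasing(mean_latency_s)
no_descent_rising = is_nondecreasing(pct_no_descent)

# Among animals that did descend, the share choosing the shallow side.
shallow_share = [s / (s + d) for s, d in zip(pct_shallow, pct_deep)]

print("rows sum to 100:", rows_sum_to_100)
print("latency rising:", latency_rising, "| no-descent rising:", no_descent_rising)
print("shallow share among descents:", shallow_share)
```

The share of descents going to the shallow side stays at or near 1.0 on every day, while latency and refusals climb, which is the separation between discrimination and exploratory tendency that the text describes.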


COMPARATIVE EXPERIMENTS


The visual cliff was designed to permit 
comparison of the behavior of different spe- 
cies and ages of animals in depth discrimi- 
nation. While some alterations had to be 
made in adapting the apparatus for some 
species, a common visual stimulus was pres- 
ent, and the same procedure was used for 
all. The slight alterations themselves are informative
and will be taken up as each species
is discussed. Depth discrimination of
the following animals was observed: rats
(hooded and albino, infant and adult),
chickens (the chick and the adult chicken), 
turtles, lambs, kids, pigs, dogs, cats, mon- 
keys, and human infants. This section will 
discuss the general depth discrimination of 
each species, including relevant observations 
of behavior. The manipulation of experi- 
mental variables will be taken up in other 
sections. 


Hooded Rats 


The ability of the adult hooded rat to dis- 
criminate depth has already been described 
in the section on basic experiments. This 
section will describe the behavior of the in- 
fant hooded animal and take up observa- 
tions of the behavior of the rat that are rele- 
vant to species comparisons and to replica- 
tion of these experiments by others. 

The infant hooded rat (27-30 days old) 
discriminates depth approximately as well as 
the adult hooded rat. The data are presented
in Table 7. Of 54 subjects run on
the standard height conditions, only two 
came down on the deep side. The infant 
rats showed, on the other hand, a marked 
preference for that side when a textured 
pattern was put directly under the glass 
there (the “zero-inch” condition). The ani- 
mal's descent in this case was away from 
the experimenters. The cliff was avoided, 
then, by the experimental group, despite a 
tendency for animals to go in the opposite 
direction in the control situation. 

The latency data for adult or infant 
hooded rats show characteristic behavior for 
this species. Subjects rarely descended from 
the center board with a latency of more than 
1 minute, and less than 1% descended with 
latencies of more than 2 minutes. Observa- 
tion periods at first were 5 minutes but were 
cut to 3 minutes. 

The height of the center board is critical
for the rat. An experiment with young rats
will illustrate this point. In the experiment
with the fine texture and 25" height reported
in Table 7, the board was set at 23^
for these young animals. Of the 14 animals
that descended from the board, one came
down on the deep side and only 3 crossed
to another side during the 3-minute observation
period. A prior experiment had been
performed with 9 hooded subjects 28 days
old and a 14” board. Here, 6 descended on
the shallow side and 3 on the deep side, and
8 of the 9 subjects crossed the board to the
other side. The median latency at the 23^
height is 25 seconds and at 14” it is 3 seconds.
Both the tendency to cross and the
latency differ significantly between the two
heights (p < .05). This experiment shows that,
for the rat, the board must be high enough
to induce some disinclination to jump down
so that predominantly visual cues will be
used for descent; otherwise this animal will
tend to respond on the basis of tactual and
kinesthetic cues.


TABLE 7

BEHAVIOR OF INFANT HOODED RATS 27-30 DAYS OLD ON THE VISUAL CLIFF

                                                   Shallow side             Deep side
Condition                         Age      N     %          Median       %          Median       % no
                                  (days)         descents   latency      descents   latency      descents
                                                            (seconds)               (seconds)

Small pattern, 53" deep side      26-27    14    93         23           0                       7
Small pattern, 0" both sides      26-27    16    19         23           75         17           6
Fine texture, 25" deep side       28-40    18    72         25           6          2            22
Coarse texture, 10" deep side     32-33    22    77         29.5         5          8            18
Coarse texture, 0" both sides     25-27    27    30         14.5         52         9.5          18
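
The crossing comparison above (3 of 14 animals crossing at the higher board against 8 of 9 at the lower board) is small enough to test exactly. The report does not name the test behind its p < .05, so the two-sided Fisher exact test sketched here is our assumption, offered only as one way such counts might be evaluated; the function name is ours.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more probable than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def prob(k: int) -> float:
        # Probability that k of the col1 "crossers" fall in row 1.
        return comb(row1, k) * comb(row2, col1 - k) / denominator

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_observed = prob(a)
    return sum(
        prob(k) for k in range(k_min, k_max + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


# Rows: higher board (3 crossed, 11 did not); lower board (8 crossed, 1 did not).
p_crossing = fisher_exact_two_sided(3, 11, 8, 1)
print(f"two-sided exact p = {p_crossing:.4f}")
```

Under this assumed test the difference in crossing is well beyond the .05 level reported.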

However, with a given height of the cen- 
ter board, an increased drop to the floor be- 
low may strengthen avoidance of the cliff. 
At the 53" depth a 13" board was used and 
93% of the infant rats descended to the
shallow side, none to the deep. Only 1 of 13 
subjects crossed to the deep side. But at the 
25" depth with an identical 13^ board, 67% 
went to the shallow side and 8 of 9 crossed 
to the deep side. Thus the low board elicited 
more crossing behavior at the 25" depth 
than at the 53" depth (p < .01), and a 
weaker preference for the shallow side. 

The experiments in Table 7 at a 10" depth 
used a 3" board which was lowered to 21^ if 
the young animal did not descend in 3 min- 
utes. Under these conditions almost all sub- 
jects came down to the shallow side and 
few crossed. 


In summary, the hooded rat, infant or 
adult, has effective visual depth discrimina- 
tion but to demonstrate its maximum acute- 
ness, the animal must be prevented from 
using tactual cues by manipulating the height 
of the center board from which it descends. 


Albino Rats 


It is generally considered that the vision 
of the albino rat is poorer than that of the 
hooded rat. But how does this affect depth 
discrimination? With adequate visual cues 
depth discrimination of albinos and hooded 
might be equivalent. Using the Model II 
apparatus with the 2" squares and the deep 
side 10" below, 94% of 16 adult albino animals
descended to the shallow side, 6% to
the deep side. No animal crossed the center 
board to the deep side, behavior indistin- 
guishable from that of the hooded rats in 
this setup. 

Infant albino subjects were also tested on 
the original apparatus (Model I) at the 53"
height with a 13" board and observed for 
5 minutes. These results are shown in Table 
8. While the difference between controls 
and experimentals is not significant, the re- 
sults are in the expected direction and, 
though the albinos look poorer in depth discrimination,
the difference between the albino
subjects and the hooded subjects in
Table 7 is not significant. Whether the large
percentage of no descents among the litter
mates of the control animals is due to the
depth situation is difficult to determine,
though it is a significant difference. The
tendency is for latencies of descents toward
the experimenters (the shallow side) to be
higher than those toward the deep side,
away from the experimenters. The experimental
animals may have stayed on the center
board rather than descend to the shallow
side, toward the experimenters.


TABLE 8

BEHAVIOR OF INFANT ALBINO RATS 26-29
DAYS OLD ON THE VISUAL CLIFF MODEL I

                           Shallow side             Deep side
Group             N     %          Median       %          Median       % no
                        descents   latency      descents   latency      descent
                                   (seconds)               (seconds)

Experimental      17    41         67           18         23           41
Control           14    29         42           7          12           [illegible]

The most general statement that can be 
made from these data about the albino rat 
is that with adequate visual stimulation the 
behavior is similar to that of the hooded 
animal. 


Baby Chicks and Adult Chickens 


The baby chick is an interesting animal 
to use on the visual cliff because it can be 
tested within a few hours of birth; there 
is little possibility of learning to avoid depth
if depth discrimination is effective so soon.
The chick is perforce a more visual animal 
than the rat; it has no vibrissae that give 
information about the environment ahead 
or forepaws for exploring it. The animal 
jumps down from a pedestal, committing it- 
self to a depth decision based on optical 
stimulation alone, within a few minutes 
after hatching. The rat, as has been pointed 
out, will only jump after prolonged training, 
though it will descend, forepaws first, feel- 
ing its way as it looks, without pretraining. 

The first experiment with baby chicks
used the cliff Model II with a fine pattern
and the height set at 25" above the floor.
Fifty-seven baby chicks 2-4 days old were
placed on the central board, 13^ high, and
observed for 3 minutes. As Table 9 shows,
only 13 came down from the board, but all
descents were to the shallow side. The second
experiment used 1-day-old chicks on
the same apparatus with the red ¾" checks
and the height of the deep side 10" above
the floor. For this experiment, the subjects
were observed for a maximum of 10 minutes,
and 20 of 27 descended to the shallow
side, 7 staying on the board for 10 minutes.
No chick in either experiment descended to
the deep side. That so few chicks came
down from the center board in the first experiment
can probably be attributed to the
short observation interval used. While 23%
of the subjects used in the first experiment
came off the board in the 3-minute period,
only 19%, a comparable number, descended
in the first 3 minutes in the second one. With
a longer observation time, most of the
subjects eventually jumped off the board.
The average chick placed on the board remains
motionless for some period of time,
then begins to cheep and move its head from
side to side and finally jumps off the board
with little of the locomotion up and down
the center board so characteristic of the rat.


TABLE 9

BEHAVIOR OF BABY CHICKS ON CLIFF MODEL II

                              Shallow side             Deep side
Condition            N     %          Median       %          Median       % no
                           descents   latency      descents   latency      descent
                                      (seconds)               (seconds)

Fine pattern, 25"    57    23         80           0                       77
Red checks, 10"      27    74         297          0                       26

Note. The observation period was 3 minutes for the "fine pattern" group, but was extended to 10 minutes for the "red checks"
group.
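
How unlikely it is that every descending chick should choose the shallow side by chance can be gauged with a simple binomial calculation. The report itself gives no such test; the sketch below is our illustration, assuming that under the chance hypothesis each descending chick is equally likely to go to either side. The function name is ours.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# First experiment: 13 chicks descended, all 13 to the shallow side.
p_first = binomial_upper_tail(13, 13)

# Second experiment: 20 chicks descended, all 20 to the shallow side.
p_second = binomial_upper_tail(20, 20)

print(f"13 of 13 to shallow by chance: {p_first:.6f}")
print(f"20 of 20 to shallow by chance: {p_second:.8f}")
```

Both probabilities are far below conventional significance levels, even though most chicks in the first experiment never left the board.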

The present experiments, thus, have used 
a different technique than Thorndike used 
with 4-day-old chicks to demonstrate, as he 
did, excellent depth discrimination in the 
day-old chick. 

Adult chickens were tested on the large
version of the visual cliff (Model III). A
total of 50 adult chickens were used, 30
white Leghorns and 20 Rhode Island
Red-Barred Plymouth Rock crossbreed chickens.⁴
The chickens had been used as subjects
in a T maze experiment; they were
accustomed to being handled and they were
on a deprivation schedule. Initial pilot
observations were made with satiated chickens, but
these subjects all remained on the board for
the entire 5-minute period. To motivate the
hungry subjects, a very small quantity of
cracked corn was scattered along the center
board in front of each subject. Usually the chickens
ate this immediately. All subjects were observed
for 5 minutes. The criterion response
was complete descent from the board
with both feet on one side of the center
board.


⁴ The assistance of Herbert L. Pick, Jr. in this
experiment is gratefully acknowledged. He both
supplied the subjects and helped run the experiment.


The results, shown in Table 10, demon- 
strate that the initial descent of the chickens 
was predominantly to the shallow side, but 
three animals first went to the deep side. 
Since the experiment lasted 5 minutes for 
each subject, some subjects walked back up 
on the board and came down again. Later 
behavior of the chickens can be divided as
follows: 17 (41%) remained on the shallow
side; 13 (32%) went up on the board, but
their later descents were only to the shallow side;
7 (17%) walked across the deep side in later
descents; and 4 (10%) later flew across the
deep side and landed on the glass. In addition,
3 animals flew across the deep side and
hit the glass when the experimenters tried 
to catch them at the end of the experimental 
session. No subjects flew from the center 
board toward the shallow side. 
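
The percentages quoted for later behavior are based on the chickens that left the board, not on all 50 subjects. The short check below recovers that base from Table 10 and recomputes the percentages; it is our arithmetic only, and the dictionary labels are paraphrases rather than the authors' wording.

```python
N_CHICKENS = 50

# Initial behavior, as percentages of all 50 subjects (Table 10).
pct_initial = {"shallow": 76, "deep": 6, "no descent": 18}
initial_counts = {k: round(N_CHICKENS * v / 100) for k, v in pct_initial.items()}
n_descended = initial_counts["shallow"] + initial_counts["deep"]

# Later behavior counts given in the text.
later_counts = {
    "remained on shallow side": 17,
    "returned to board, later descents shallow only": 13,
    "later walked across deep side": 7,
    "later flew across deep side": 4,
}
later_total = sum(later_counts.values())
later_pct = {k: round(100 * v / later_total) for k, v in later_counts.items()}

print("initial counts:", initial_counts)
print("animals that descended:", n_descended, "| later-behavior total:", later_total)
print("later percentages:", later_pct)
```

The four later-behavior categories account for exactly the 41 animals that descended, and the recomputed percentages match those given in the text.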

То help describe the behavior of the adult 
chickens, the written protocols of the first 
six subjects run are included below: 

Sı. Latency—5 minutes. Would not move until 
corn put on board, ate corn, went twice to shallow 
side, would not go to deep, at end leaped and flew 
toward deep side, hit glass, crouched. (Corn put 
on center board for all subsequent subjects.) 


52. Latency—1:30. Ate corn on board then went 
to shallow side, back to board, then to shallow, 
back to board where feet put slightly on deep, 
then flew to perch on side board at deep near pin- 
wheel (had walked around shallow side as far as 
pinwheel several times). 


Sa. Latency—0:45. Off on shallow, stayed. 


S. Latency—0:15. Off on shallow side three 
times from board; at end of session experimenter 
tried to get subject and it flew across toward deep 
side, landed on glass near pinwheel. 

S5. Latency: 0:30. Off on shallow, stayed on
shallow. 


TABLE 10

BEHAVIOR OF ADULT CHICKENS ON LARGE VISUAL CLIFF, MODEL III

            Shallow side              Deep side
  N     % descents   Median      % descents   Median      % no
                     latency                  latency     descent
 50         76        0.62            6        2.00         18

S6. Latency: 0:10. Off on shallow three times
then walked across deep; would not peck at grain
on deep side.

In general, the adult chickens markedly 
preferred the shallow side. Subjects on the
center board often flapped their wings when 
looking over the edge of the board down 
toward the deep side. They frequently flew 
toward the deep side but some subjects also 
walked across it, usually with a peculiar 
high stepping gait and some wing flapping. 
The chicken is an animal with good depth 
discrimination, as is shown in the initial de- 
scent behavior, but wings offer protection 
from harm at heights as low as those used 
in this experiment. 


Kids, Lambs, and Pigs 


The goat and the sheep are interesting
animals for studying depth discrimination, 
since they are descended (the goat in par- 
ticular) from mountain climbing ancestors. 
When they are born they walk immediately, 
like the chick. A total of 16 kids and 20
lambs were used, their ages varying from 
24 hours to 77 days old. Animals were 
borrowed from the Cornell Behavior Farm 
for this experiment.⁵ First observations
were carried out in Morrill Hall at Cornell 


5 We are indebted to A. U. Moore for help in 
this experiment and to Jalal Besharat and John 
Wiley for testing many animals at the Behavior 
Farm. 


—William Vandivert 


Fig. 8. Young goat being tested on visual cliff, 
Model III. 


on the large visual cliff with a 1" high 
board, on five infant goats 6-7 days old. All
animals walked off the center board to the 
shallow side on each trial (two trials per 
subject). The subjects avoided the cliff side 
and would not approach it. They preferred 
to look at the floor over the edge of the 
shallow side where the side board offered
support in preference to being too close to 
the center board. The subjects were also 
placed successively on the glass on either 
side (a textured surface directly under them 
on one side and 40" below on the other). 
The behavior of the animals was highly 
stereotyped. When placed on the shallow 
side they walked forward immediately. When 
placed on the glass of the deep side an im- 
mediate backing response was observed, the 
animals' front limbs became rigid and the 
hind limbs pushed backward. If the animal 
was forcibly pushed by the experimenter 
across the glass, the front limbs remained 
rigid until the head was over the center 
board, the front limbs 2" to 6" away from 
it. At this point, the animal suddenly leaped 
forward on to the center board and across 
to the shallow side. The complete protocols 
of these five animals are reproduced below: 

S1. Abel, male, born January 31, 1959. Placed
on board at east end of room, off on shallow at
45 seconds, walked around on shallow. On board
from west side, off on shallow at 4 seconds (off
on shallow means with all 4 feet). Walked
around on shallow for about 5 minutes. Put
in middle of glass on deep side. Backed up until
reached side board. Put in middle of glass on
shallow side. Walked forward after about 5
seconds.


S2. Aaron, male, born January 31, 1959. On
board from east side, off on shallow at 10 seconds, 
walked around. On board from west side, off on 
shallow at 3 seconds, walked around. Put in mid- 
dle of glass on deep side. Backed up immediately 
until got to side board, experimenter pushed to- 
ward center board, resisted, fell on forelimbs. 
When 2" from center board suddenly, as if re- 
leased from a taut spring, got up on center board 
and walked off on shallow. 


S3. Carl, male, born February 1, 1959. (This
subject had weak right hind leg.) Put on board 
from west side. Off on shallow 50 seconds, 
walked around. Put on board from east side. Off 
on shallow 35 seconds, walked around on shallow. 
Put in middle of deep, backed up until reached 
board at edge. Put in middle of shallow, went 
forward after some teetering on weak legs. 
S4. Cora, female, born February 1, 1959. On
board at west side, off on shallow at 42 seconds. 
On board at east side, off on shallow at 41 sec- 
onds. Put in middle of shallow, went forward 
immediately. Put in middle of deep, moved back- 
ward immediately against side of apparatus. 


S5. Mac, male, born January 30, 1959. Put on
at east, off on shallow at 7 seconds. Put on at 
west, off on shallow at 2 seconds. Put in middle 
of glass on shallow side, walked forward immedi- 
ately. Put in middle of glass on deep side, froze, 
stayed there 30-40 seconds (put very close to 
center board, forelimbs about 6” away), jerky 
movement forward, reached center board and 
leaped up on board and over on shallow side. 

After this experiment the large cliff was 
taken to the Cornell Behavior Farm and 
subsequent observations were made there. The
shallow side and the center board were cov- 
ered with the #” checked squares and the 
checked squares were also laid over the concrete
floor. Twenty lambs were used. All
went immediately to the shallow side. Ages
of the lambs ranged from 24 hours to 65
days. Later, 11 additional kids, aged 15-77
days, behaved similarly to the first five
kids used. All went to the shallow side. No
animal ever spontaneously walked across the 
glass of the deep side. 


Besharat, Wiley, and Moore covered a 
large piece of plywood with the checkered 
pattern. If the plywood was placed under 
the glass of the deep side the subjects, lambs 
or kids, would walk around on it. If the 
board was dropped the animal immediately 
froze and backed up. Lambs, which bleated 
frequently during the experiment, stopped 
bleating when visual support was removed. 
They remained motionless and trembling on 
the surface of the glass. The kids, on the 
other hand, were much more lively. They 
seldom bleated, they were more exploratory 
at the shallow side and, though they backed 
up, they kept moving on the deep side. The 
behavior with the "optical floor" is so stere- 
otyped in these species that simply by rais- 
ing or lowering the board under the animal 
its behavior can be converted from free lo- 
comotion to rigid immobility. This was 
demonstrated many times in a short obser- 
vation session and within the limits used 
(8 to 10 raisings and lowerings of the 
board) there was no extinction. 


Two pigs, 6 weeks old, were also put on 
the central board and they immediately ran 
to the shallow side. Over the void they became
rigid, behavior more similar to that of the
lamb than to that of the goat.

The completely predictable, stereotyped, 
behavior of these animals fits well with a 
theory of evolution. The ungulates are gen- 
erally large hoofed animals with spindly 
legs. Most of them can move their bodies 
rapidly across the surface of the land, but 
any slight misstep (a hole in the ground, 
the edge of a cliff) will throw too much 
weight on the slight supporting limbs, break- 
ing them. For such an animal, herbivorous 
but often preyed upon by carnivores, ac- 
curate and immediate depth discrimination 
and discrimination of the existence and state 
of the substratum is mandatory for survival. 
The species tested not only have evolved 
accurate depth perception but also a pro- 
tective adaptive response that immediately 
stiffens the forelimbs and backs the animal 
away from a drop-off. This protective adap- 
tive response to a potential drop-off is so 
primitive that when the kid, for example, 
is shocked with electricity in the forelimb 
the same response is obtained (see Gibson, 1952).


Turtles 


The aquatic turtle was tested by Yerkes 
(1904) and found to have poorer depth 
discrimination than the land forms. An 
aquatic animal is interesting to observe on 
the visual cliff, not only for depth discrimi- 
nation but also to determine the role of the 
glass. Perhaps the glass surface would ap- 
pear like water and attract more descents 
to the deeper side, or the responses might 
be equally divided between the two sides 
because of the inadequate depth discrimina- 
tion of this species. The turtle selected for 
this experiment was Pseudemys scripta ele- 
gans, a turtle inhabiting ponds and spending 
time on land only for nesting or migration. 

The turtles were placed on the large vis- 
ual cliff at one end and observed until they 
crawled off the board onto the glass. They 
were then picked up, usually after they had 
crawled 4-5’ over the glass, and replaced on 
the opposite end of the board. A total of
6 trials was given for each animal, if possi- 
ble. Because the animals often did not move 
in 15 to 20 minutes, all trials were not nec- 
essarily given in one session but, where nec- 
essary, spread out over several days. Even 
so, not all animals came off the center board
six times, but all animals except one had 
at least four trials. This turtle spent several 
hours on the visual cliff without moving. 
The results are shown in Table 11. Eight 
out of 10 of the turtles first went to the 
shallow side of the apparatus, and subsequent
trials still showed a marked preference
for the shallow side. Three turtles
showed perfect discrimination, three went 
five out of six times to the shallow side, 
three showed no preference or a tendency 
to go most frequently to the deep side, and 
one animal, with only one trial, is incon- 
clusive. When the turtle does come down 
on the deep side it behaves the same way 
that it does on the shallow side. It crawls 
slowly forward and does not look down. 
However, many subjects started toward the 
cliff side, paused, and turned around to go 
off the board on the shallow side. 

In summary, the aquatic turtle tested here 
showed discrimination of depth, supporting
the observations of Yerkes, but its depth
discrimination was not as marked as that 
of other species tested, such as the goat. In 
fact, depth discrimination in aquatic turtles 
was the poorest of any species studied so 
far. 


Cats 


That a cat has good discrimination of vis- 
ual depth seems almost self-evident, in view 
of its ability to pounce at and seize prey, 
and to perform such skilled visual-motor 
coordinations as walking along a fence top. 
But very little, if anything, is known about 
the development of such discriminative 
ability. The visual cliff seemed to present 
an excellent means of testing a kitten’s 
ability to discriminate differences in depth 
as early as locomotion was possible. 

Our first attempt to study behavior of 
kittens on the visual cliff was a failure, 
since the kittens were tried immediately 
after the eyes had opened (about 10 days) 
and proved unable to locomote with any 
control at all of motor coordination. They 
fell backward or lay on the center board 
and mewed. The falls occurred because the 
center board was too narrow. 


TABLE 11

BEHAVIOR OF TURTLES ON LARGE VISUAL CLIFF

                            Shallow side               Deep side
          Place of      Number of   Median        Number of     Median
Turtle    first         descents    latency       descents      latency
          descent                   (seconds)                   (seconds)
   1      shallow           6         180             0
   2      deep              2          30             4            28
   3      shallow           1         900             0
   4      shallow           6         180         [illegible]      20
   5      shallow           5          35             1         [illegible]
   6      shallow           6         145             0
   7      shallow           5          40             1         [illegible]
   8      deep              2         180             2           135
   9      shallow           1         120             3         [illegible]
  10      shallow           5          90             1         [illegible]
Total                      39                        12

The second attempt to test cats was
planned with knowledge of the earlier mis- 
takes. It was decided to test the cats only 
after the visual placing response had ma- 
tured, and to observe them on the large cliff 
with a center board 9" wide. The height of 
the center board was raised or lowered as 
the size of the kitten required. Two litters 
of kittens were observed, each six times, for 
a 2-minute interval. All the trials took place 
the same day. The first litter was obtained 
from a farm. The kittens were at least a 
month old and rather wild, since they had 
lived in a barn and had never been petted.
The second litter was brought to the labora- 
tory when the kittens were 10 days old and 
was reared thereafter in the laboratory. 
These kittens were very tame. They were 
tested when they were 27 days old. 

Table 12 presents the results for these 
two litters of kittens. Half the first litter 
was tested with a 1" center board. This 
board proved to be much too low, since 
these kittens were very agile and easily 
touched the glass surface in all directions, 
leaning far out without falling. For the 
other four kittens, the board was raised to 
8". They could not touch the glass, but it 
was an easy jump for them. The visual 
placing response was present in every kitten. 

The first four kittens (with a 1" center
board) walked to the shallow side 79% of
the time, and to the deep side 21% of the
time. None of them remained on the board.
The second four kittens (with an 8" center
board) jumped 67% of the time to the shallow
side, 4% (only once) to the deep side,
and 29% of the time did not descend in the
2-minute interval. The kitten which went to
the deep side appeared to be attempting to
escape from the experimenters.


Fig. 9. A kitten looking over edge of cliff,
Model III.

The second litter of six kittens was all 
run with a 24” board when the kittens were 
27 days old. They were smaller than the 
other kittens, but had good motor coordina- 
tion. All of them had a well-developed vis- 
ual placing response. They went 86% of 
the time to the shallow side and never to the 
deep side. There was no descent 14% of 
the time. 
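
The percentages reported for the three groups of kittens can be converted back into numbers of trials. The following is a minimal sketch, assuming that every kitten contributed all six of its 2-minute trials and that the percentages were rounded to whole numbers; the function name and group labels are hypothetical, and the resulting counts are derived here rather than quoted from Table 12.

```python
TRIALS_PER_KITTEN = 6  # each kitten was observed six times


def counts_from_percents(percents, n_kittens):
    """Convert rounded percentages (shallow, deep, no descent) to trial counts.

    Raises ValueError if the percentages cannot be reproduced from a whole
    number of trials.
    """
    n_trials = n_kittens * TRIALS_PER_KITTEN
    counts = [round(p * n_trials / 100) for p in percents]
    recomputed = [round(100 * c / n_trials) for c in counts]
    if sum(counts) != n_trials or recomputed != list(percents):
        raise ValueError("percentages do not match a whole number of trials")
    return counts


# (percent to shallow, percent to deep, percent no descent), number of kittens
groups = {
    "barn litter, 1-inch board": ((79, 21, 0), 4),
    "barn litter, 8-inch board": ((67, 4, 29), 4),
    "laboratory litter": ((86, 0, 14), 6),
}

trial_counts = {
    name: counts_from_percents(percents, n_kittens)
    for name, (percents, n_kittens) in groups.items()
}

for name, counts in trial_counts.items():
    print(name, counts)
```

The second group gives a useful check, since the text states that the 4% of descents to the deep side corresponds to a single trial.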

The behavior of the cats on the glass of 
the shallow side was confident and normal. 
They walked or ran about, especially the 
first litter of rather wild kittens, which ran 


TABLE 12

BEHAVIOR OF LIGHT-REARED KITTENS ON LARGE VISUAL CLIFF

                                            % of trials
Kittens                           N    To shallow   To deep   No descent
Barn, 1 month old, 1" board       4        79          21          0
Barn, 1 month old, 8" board       4        67           4         29
Tame, 27 days old, 214” board     6        86           0         14

Note. Each kitten was tested six times for 2-minute intervals.

from the experimenters when an attempt
was made to remove them. The kittens 
were each placed, after the experiment, on 
the clear glass of the deep side. The be- 
havior here was a great contrast to the 
walking and running about on the shallow 
side. The kittens backed up, looking con- 
stantly downward. The backing was fre- 
quently round and round in a circle, since 
the kitten was looking downward and could 
not see where it was going. Sometimes it 
succeeded in backing to the wall; then it 
proceeded to creep around the edge, hugging 
the wall, and walked on what appeared to 
be a small path below the glass made by 
the supporting board. One kitten backed to 
the wall, climbed the edge and clung to it. 
None of these kittens walked forward on 
the clear glass of the deep side, as they had 
on the other side. This behavior was some- 
times accompanied by shivering and mewing. 

That there was a strong preference for 
the shallow side is evidenced by the trials 
for the individual cats, as well. No cat ever 
went more often to the deep side. One cat, 
the wildest, went an equal number of times 
to the two sides. It also jumped from the 
table side to the floor when the experimen- 
ters tried to catch it. 

The data as a whole permit the conclusion 
that a young cat of 27 days, when reared 
normally, has developed a good discrimina- 
tion of depth and avoids a deep visual drop- 
off. 


Dogs 


A dachshund puppy and a litter of cocker 
spaniels were tested on the large visual cliff. 
The dachshund puppy was put on the center 
board, raised to a height of 24”, and ob- 
served for 2-minute intervals with the dog's 
owner standing alternately at the far end of 
either side. The puppy had been petted a 
good deal. It rushed to its owner when he 
stood at the shallow side, and cried and 
whimpered when he stood at the other side. 
It never crossed to him over the glass of 
the deep side. 

The cocker spaniel puppies, a litter of six, 
were tested at 7 weeks of age. The center 


board was raised to a height of 21" and 
each animal was placed six times on the 
center board. After this period of testing 
each animal was placed successively on the 
deep and the shallow side, twice on each 
side, and observed. The results are shown 
in Table 13. No puppy chose the deep side 
for descent from the center board. The 
overall median latency for descent was 6 
seconds and the median latencies for indi- 
vidual puppies ranged from 3 to 20 seconds. 
When placed on the shallow side the pup- 
pies walked forward; on the deep side, they 
usually remained motionless for a period of 
time and then backed up. Two of the pup- 
pies visibly trembled when placed on the 
deep side and another one squealed, behavior 
not observed on the shallow side. These 
protocols are also reproduced in Table 13. 
In the case of the two animals that trembled 
when placed on the deep side, a piece of 
masonite was raised from below and put 
right under the glass. The puppies immedi- 
ately walked forward. 

The data, thus, show that these dogs had 
excellent discrimination of depth. 


Monkeys 


Two infant rhesus monkeys⁶ were observed
several times on the large cliff (Model
III). The cliff happened to be arranged so 
as to equate density of texture for the two 
sides (see later experiments). Motion par- 
allax, and, for the monkeys, probably bi- 
nocular parallax were available as cues. 
One (male) was observed at 10 days, at 18 
days, and at 14 months. At 10 days, this 
animal (Albert) left the center board at
once and scampered around the shallow side. 
When placed on the board a second time, 
with arms placed by the experimenter on 
the glass of the deep side, he went to the 
wall at the edge and followed it round the 
deep side, coming back to the center board. 
Put on again, he crossed over the glass of 
the deep side to the wall. The manner of 
locomotion was different from that on the 
shallow side, for he crawled with stomach 


6 Made available through the courtesy of Robert
Zimmerman. 
close to the glass and looked down constantly.
When he reached the wall, he clung
to it. A blanket was placed on the glass at 
the far edge of each side as a lure. 

At 18 days, his bottle was used as a lure, 
first on one side and then on the other for 
a number of trials. At this time, he went 
consistently to the bottle at the shallow side, 
but always refused to cross the glass over 
the deep side. Other testing during the in- 
terval between 10 and 18 days showed that 
his form discrimination had developed mark- 
edly during the time. The behavior on the 
cliff had changed greatly since the trial at 
10 days and showed a clear distinction be- 
tween the two sides. 

At 14 months, Albert became very emo- 
tional when placed on the cliff. He hugged 
his legs, head down on the board, and refused
at length to move. He finally leapt to
his blanket when it was waved at him on the
shallow side, but refused to reach for it on 
the other side. 

The other monkey, a female (Maya), was
observed first at 12 days. She seemed im- 
mature in both motor and perceptual ability. 
She crawled to the shallow side, and was 
also coaxed eventually onto the deep side. 
Her gait differed; she dragged her hind 
legs and looked down constantly on the deep 
side. 

Maya was observed again at 35 days. She 
consistently went over the shallow side for 
her blanket but refused over and over to 
cross the deep side for it. Several times 
she retreated to the shallow side, even 
though the blanket was being proffered 
from the other side. 

She was placed finally on the glass over 
the deep side. She lay prone, arms and legs 


TABLE 13

BEHAVIOR OF COCKER SPANIEL PUPPIES ON LARGE VISUAL CLIFF

     Times to  Times to  No
S    shallow   deep      descent  Behavior on shallow side      Behavior on deep side

1       6         0         0     (1) walked forward            (1) sat up for 20 seconds then backed up
                                  (2) walked forward            (2) sat up for 40 seconds then two steps
                                                                    forward, then backed up, then walked
                                                                    forward to center board
2       5         0         1     (1) walked forward            (1) lay down, after 10 seconds started to
                                  (2) walked forward                tremble, experiment stopped after 1 minute
                                                                (2) lay down 30 seconds then started to
                                                                    tremble
3       6         0         0     (1) walked forward and        (1) sat still for 40 seconds then backed up
                                      then lay down             (2) lay down for 40 seconds then backed up
                                  (2) walked forward and
                                      then lay down
4       6         0         0     (1) walked forward slowly     (1) lay down, after 1 minute started to
                                  (2) walked forward slowly         tremble
                                                                (2) lay down and whimpered
5       6         0         0     (1) walked forward            (1) sat still upright for 8 seconds then
                                  (2) walked forward                squealed and backed up
                                                                (2) sat still upright for 20 seconds then
                                                                    backed up
6       6         0         0     (1) walked forward            (1) sat still upright for 20 seconds then
                                  (2) walked forward                backed up
                                                                (2) sat upright for 20 seconds then
                                                                    backed up
%      97         0         3

curved out at the sides, and looking downward,
much as if she were clinging to a tree
branch. 

Both these monkeys appeared to have ma- 
tured in depth discrimination between the 
first and second tests. Thus, locomotion 
appeared to precede by a week or so good 
depth discrimination in these animals. In 
its natural environment, the monkey would 
be carried by its mother during this period. 


Human Infants 


After the large visual cliff was first con- 
structed, and a 50-pound weight had been 
placed on the glass to make sure it was safe 
for infants, an 18-month-old male infant, 
who had been walking since he was 10 
months old, was placed on the center board. 
He crawled off the center board to the 
shallow side and stood up. He could not 
be persuaded to walk across the glass of the 
deep side, but he was then picked up, placed 
on the glass of the deep side, and coaxed 
toward the center board. He firmly clutched 
the wooden support with one hand, curled 
his toes and hitched himself cautiously to- 
ward the experimenter. When he reached 
the center board he crawled up on it and 
ran over the glass of the shallow side. 

But this child was 18 months old, and 
had fallen, according to his parents, from 
cribs, beds, sofas, chairs, etc., on to the hard 
floor or the bare ground. Would children 
who had just learned to crawl behave simi- 
larly to this child? Or had this child learned 
caution from falling? The following report 
describes the testing of 36 infants, 6-14 
months old, that were crawling, according 
to their mothers. 

The first infants were placed on the cen- 
ter board and observed for several minutes, 
a procedure that had worked well with ani- 
mals. The mother was placed behind a 
screen where the child could not see her.
A toy (red, white, and blue pinwheel) was
placed at the end of both sides. It rotated 
slowly and emitted a tinkling sound as it 
turned. The protocols of the two subjects 
tested with this procedure follow: 


S1. Girl, 9 months old. Looked a number of
times to both sides and at both pinwheels, touched
shallow side but wouldn't go either way. (Mother
behind screen. Total testing time 8 minutes.)


S2. Girl, 6 months old. Looked mostly at experimenter,
a little at pinwheels; crawled halfway
across board, slipped on shallow side, back
to board, then off on shallow again, near experi- 
menter at west side of room by now, cries and 
is picked up by experimenter; put face down to 
glass on deep side during testing, peered under. 
(Mother behind screen. Testing time 7 minutes.) 

The third subject marked a change in 
procedure. After the infant would not 
move the mother was put at the shallow 
side. She turned the pinwheel and talked 
to the child. The protocol follows: 

S3. Male, 10½ months old. (Mother at shallow
side, experimenter at deep side.) Child cries a
little; at 4 minutes 50 seconds from time of first
starting testing gets off board toward mother, goes
to lure, touches it. (Mother at deep side, experimenter
at shallow side.) Cries when put back on
board; knee on shallow side toward experimenter
but won't go farther, cries.

The standardized procedure evolved was 
as follows: The mother stood twice at each 
side, alternating, some mothers starting at
the shallow side, some at the deep. The 
mother stood for 2 minutes at each side un- 
less the child got off the board and reached 
a lure. If this happened, the child was put 
back on the board and the mother switched 
sides. An experimenter stood at each end 
of the board so as not to influence the way 
the infant crawled. If the child crawled 
away from the mother the experimenter 
went toward the infant to safeguard him. 


The use of this standardized procedure 
clearly showed that the babies discriminated 
depth. They crawled toward the mother 
when she stood at the shallow side and re- 
fused to cross the glass to her when she 
stood at the deep side. Many infants crawled 
to the shallow side when the mother stood 
at the deep side twirling the pinwheel and 
urging the child to come to her. Eleven subjects
did this; no subject crawled away from the 
mother across the deep side when she stood 
at the shallow. Some of the babies cried 
when the mother stood at the deep side and 
would not go to her. In such cases, the 2- 
minute observation period from the deep 
side was usually terminated at 1 minute. 

Once the procedure was standardized, 
from Subject 10 on, the infants tended to 
--William Vandivert


Fig. 10. This mother has just placed her child
on the center board in preparation for testing the
infant.


behave very consistently. They crawled to 
the shallow side twice; in only two cases 
did the child go but once to the shallow side. 
The three negative cases, all boy infants, 
were also consistent; each child crawled 
twice to the mother across the deep side, 
twice to her at the shallow side. 

When 30 subjects had been run, there 
were five subjects in the youngest (6-7 
month) age group. Of these infants, three 
had not moved from the center board, one 
had gone to the shallow side only and one 
to both sides. Even though two cases is not 
a large sample, one of the two had crawled 
across the deep side and it seemed possible, 
a trend that had to be checked, that very 
young infants could not discriminate depth 
as adequately as older ones. Consequently, 
telephone calls were made to mothers in the 
city with infants 6-7½ months old. Very
few of these infants were crawling, but five 


subjects were added to the youngest group. 
Of these five children, two remained on the 
center board and three crawled only to the 
shallow side. The indication was, therefore, 
that younger infants have as adequate depth 
perception, if they can be tested, as the older 
ones. 

The results on the first 36 subjects run 
are shown in Table 14. The only age trend 
is the inadequate locomotor ability of the 
younger subjects. They evidently crawled 
at home but not in a strange place. One 
must recognize that Table 14 is not a ran- 
dom sample of babies at the indicated ages, 
but a sample of infants whose mothers say 
they crawl. It is probably slightly skewed 
toward younger developers in the 6-7 months 
old group. 

There is much interesting behavior to be 
observed in this situation. The babies were 
attracted by the lure and when they reached 
it, played with it eagerly. They peered down 
through the glass, sometimes patted it or 
leaned on it with their faces, yet refused 
to cross. Some used the deep side for sup- 
port with one knee, others backed partly out 
across it (in first locomotion in the human 
infant the child often goes in reverse when 
he means to go forward), yet they still refused
to cross. It was as if the infant could
not recognize the consequences of his own 
actions, since he had already been where he 
now refused to go. The attitudes of the 


TABLE 14

BEHAVIOR OF HUMAN INFANTS ON THE VISUAL CLIFF

Age of infant (months)
Did not move off center board
To shallow side only
To deep side only
To both sides
Total

mothers were interesting as well. The predominant
impression among mothers seemed
to be that the child had failed the "test" be- 
cause he did not have enough sense to real- 
ize the glass was safe to crawl over. The 
glass on the deep side was banged with 
hands and fists; cigarette boxes, lipsticks, 
purses, crumpled bits of paper, and other 
releasers of infant approach behavior were 
proffered—but the babies still refused to go 
across the glass of the deep side. 
Reproduced below are two protocols from 
the experiment. The first one illustrates the 
youngest negative case, and the second the 
normal behavior of an older infant. 


Sw. Male, 6½ months. (Mother at shallow.)
Baby looks at both lures, turns toward mother
and crawls to her after 60 seconds. (Mother at
deep.) Turns to her at 15 seconds, starts to
her at 20 seconds, and gets there in 50 seconds.
(Mother at shallow.) Goes to her at once, gets
there in 15 seconds. (Mother at deep.) Baby
starts toward shallow side; mother goes toward
him; he goes back toward her, reaches her at
deep in 60 seconds; then looks down through
glass and starts back to center board.


Fig. 11. A mother calls to her child from the
deep side of the apparatus, but he refuses to
cross to her. (Reproduced from Scientific American
by permission.)

Sm. Male, 11 months. (Mother at deep.) He
looks both ways and at experimenter; whimpers;
looks other way, whimpers (2 minutes). (Mother
at shallow.) Goes instantly to shallow side.
(Mother at deep.) Baby goes other way (to
shallow) at once, cries, starts toward mother,
cries. (Experiment terminated here at 1 minute.)
(Mother at shallow.) Gets to mother in 3 seconds.

These data show that the average human 
infant discriminates depth as soon as it can 
crawl. By the time that locomotion is ade- 
quate, which is the time when depth dis- 
crimination is necessary for survival, the 
infant can discriminate depth. In this the 
human infant fits with other late maturing 
species we have studied, the rat and the cat.
But the human infant does not have quite
the same marked apprehension of depth as
the goat or the sheep, and a few crawled
across the deep side. One also notes the
relative clumsiness of the human infant at 
this age. Despite adequate depth discrimi- 
nation many babies would have fallen but 
for the glass on the deep side to protect 
them. 

But there is no evidence from these data 
that apprehension of height is learned from 
prior experience with falling. The avoid- 
ance and apprehension of height seems in 
general to be present as soon as an infant 
has adequate locomotion. 


Conclusion 


Comparative studies of depth discrimina- 
tion on the cliff revealed that all the animals 
studied (hooded and albino rats, chickens, 
goats, lambs, pigs, dogs, turtles, cats, mon- 
keys, and human infants) have some capac- 
ity for discriminating depth by visual cues 
alone. The relative excellence of the ability 
cannot be evaluated with great accuracy, but 
there was evidence that the turtles studied 
were inferior to the other species. The al- 
bino rat gave some (but only slight) evi- 
dence of poorer discrimination than the 
hooded. 

The remarkable fact, indeed, is that ani- 
mals with such widely differing eyes—a 
panoramic ocular system in the case of goats
and sheep, for instance—show similar be- 
havior in this one respect. The human and 
the monkey infants, though better able to 
utilize binocular cues, were certainly not su- 
perior to the other young animals tested, 
some of whom (the chick and the goat, for 
instance) exhibited highly discriminative be- 
havior a few hours after birth. 

The ecology and “habits” of a species 
obviously must be considered in interpreting 
the developmental differences between spe- 
cies. The monkey and the human infant 
are normally carried by their mothers for 
a considerable time before locomotion is in- 
dependent. That maturation of perceptual 
abilities should parallel this “plan” is hardly 
surprising. The cat, which is ordinarily hid- 
den in a dark place by its mother for some 
weeks, is also a relatively late maturer; but 
the ungulates must be ready to follow and 
run at once, and both motor and perceptual 
capacities appear to be adequate very soon 
after birth. 


EXPERIMENTS ON THE EFFECTIVE STIMULI 
FOR DISCRIMINATING DEPTH ON THE 
VISUAL CLIFF 


It was pointed out, in describing the ap- 
paratus, that a jump or sudden transition 
in texture density (Figure 1) characterizes 
an edge, and that a difference in the amount 
of this transition (Figure 2) characterizes 
the difference between the shallow and the 
deep edge of the platform, i.e., the amount 
of drop-off. When the same pattern is used 
directly under the glass on one side and at 
the floor level on the other side, the size of 
the elements of the texture, as projected to 
the head of the animal, will be quite differ- 
ent. Is it this texture density difference 
that is effective in producing the differen- 
tial behavior? Would the texture density 
difference suffice alone? What other stimuli 
are available and could they also serve as 
cues? 

Although Figures 1 and 2 do not show it, 
two other types of stimulus information 
about the drop-off are possible. The first is 
a jump in the motility of the texture at the 
edge, caused by head movement parallax, 


and the second is a jump in the disparity of 
the texture at the edge, caused by binocular 
parallax. (A diagram of the first is to be 
found in Gibson & Walk, 1960.) The ques- 
tion is whether our animals could register or 
respond to this supplementary information. 

The anatomical evidence suggests that all 
animals with eyes are sensitive to motion 
perspective, but that only some can register 
disparity perspective—presumably those with 
the greatest degree of overlap of the binocu- 
lar fields of view. It is known that man can 
see depth in a stereoscope, but what other 
animals can do so is not established. We 
did not attempt to test for the effectiveness 
of binocular disparity as a cue, but did for 
motion parallax, since it is universally po- 
tentially present as a stimulus. We decided 
to try to isolate the difference in texture 
density from any accompanying difference 
in motion parallax, and then to isolate the 
difference in motion parallax from any ac- 
companying difference in texture density. 

The use of motion parallax is not only 
possible, but highly probable for all the sub- 
jects. Head movements of the animal as it 
moves its head or walks along the edge will 
produce a much greater differential velocity 
of the texture elements, relative to the edge 
of the board, for the deep side than for the 
shallow. Or, if the animal compares one 
side with the other, the apparent velocity of 
the optical textures on the two sides pro- 
vides a differential stimulus for the two. The 
density difference, then, and the parallactic 
motion of the projected stimuli caused by 
the animal's own movement appear to be 
two universal types of potential stimulus in- 
formation for depth in the cliff situation. 
Are they effective, as well as potential? 
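
The inverse relation between viewing distance and parallactic velocity can be sketched numerically. The following Python fragment is only an illustration and is not part of the original procedure: it assumes a sideways head movement, a texture element directly beneath the eye, and hypothetical values for head speed and eye-to-surface distance. The function name `angular_velocity` is likewise invented for the example.

```python
def angular_velocity(head_speed, distance):
    """Angular velocity (radians/second) of a texture element directly
    below the eye, for an eye translating sideways at head_speed.

    head_speed and distance must share one length unit (e.g., inches).
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return head_speed / distance


# Hypothetical values, chosen only for illustration.
head_speed = 2.0         # inches per second (assumed)
shallow_distance = 5.0   # eye to near surface, inches (assumed)
deep_distance = 15.0     # eye to far surface, inches (assumed)

shallow = angular_velocity(head_speed, shallow_distance)
deep = angular_velocity(head_speed, deep_distance)

# The nearer surface sweeps across the eye faster, in inverse
# proportion to its distance.
print(f"shallow: {shallow:.3f} rad/s, deep: {deep:.3f} rad/s")
print(f"ratio shallow/deep: {shallow / deep:.1f}")
```

Because the ratio depends only on the two distances, the differential stimulus would be available at any head speed under these assumptions.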

The following experiments, then, tested 
the role of these two stimuli by attempting 
to separate them experimentally. Rats and 
baby chicks were used principally for these 
experiments, since the smaller apparatus for 
them made total control of stimuli easier. 
The large room required for testing the hu- 
man infants, for instance, made isolation of 
the sources of differential stimulation im- 
practical. The rats and chicks were plenti- 
ful and their entire field of view could be 
manipulated experimentally. 


Comparison of Texture Densities 


Is a coarse texture density preferred to 
a fine texture density, when other cues to 
depth are eliminated? The first experiment 
on this question was performed with adult 
hooded rats. The second apparatus (Model 
II) was employed with the coarsely checked 
texture ($^ squares) placed on one side of 
the apparatus, and an identical but smaller 
pattern (1^ squares) on the other side. The 
textured patterns were directly under the 
glass on both sides. There was, therefore, 
no actual difference in depth between the 
two sides. But the density difference was 
the same as that projected to the animal's 
eye when the coarse pattern was on the floor 
10” below on one side and under the glass 
on the other. If a density difference alone 
is an effective stimulus for perceiving a 
depth difference, the animals might be ex- 
pected to show a preference for the side 
having a coarse texture. The two sides ex- 
hibiting the coarse and the fine texture were 
alternated, in this experiment, so that no 
bias for one direction or the other could be 
confounded with the density difference. 

Table 15 shows that these hooded adult 
rats did exhibit a significant preference for 
the side floored with the coarser texture. 
The preference is not quite as pronounced 
as that usually found with these animals 
when the density difference is caused by one 
surface placed farther below than the other 
(93% as compared with 100%). 


TABLE 15 

ANIMALS DESCENDING TO THE COARSE OR TO THE 
FINE TEXTURED SURFACE, DIRECTLY UNDER 
GLASS ON BOTH SIDES 

                                % to coarse   % to fine   % no
Animal                     N    textured      textured    descent
                                surface       surface
Adult hooded rats          37   84            16          0
Adult albino rats          16   69            19          12
Infant hooded rats         24   87.5          8.3         4.2
Baby chicks (1-2 days)     46   33            46          21

A group of adult albino rats was also run 
on this experiment, but the density ratio was 
increased. The squares of the coarse tex- 
ture were 13^ on a side, while those of the 
fine texture were $”. Under these condi- 
tions, the albino rats showed some prefer- 
ence for the coarse texture (69% descended 
on this side, compared with 19% on the 
other). 

A group of young hooded rats (29-31 
days old) was run with the same density 
difference as the adult hooded group. They 
showed a similar preference (87.5% to the 
coarse texture and only 8.3% to the fine 
texture). All the rats, then, showed a dis- 
crimination of texture density and a prefer- 
ence for the coarser one. 

Any textured surface, actually, is prob- 
ably preferred to a homogeneous one which 
provides no cues for a surface. A group of 
hooded rats (N = 10) was run with 1^ 
checks on one side of the board and a homo- 
geneous grey surface on the other, both di- 
rectly below the glass. All of the animals 
that left the board (90%) descended to the 
textured side. 

A control experiment was performed to 
find out whether rats would prefer coarse 
textures to fine textures in a different situa- 
tion. A Y maze was covered with textured 
surfaces as follows: the stem of the Y was 
covered with homogeneous grey and one 
arm was covered with the 2" checks, the 
other with 1^ checks. Twenty-four animals 
were placed on the grey stem and the ex- 
ploratory behavior was observed for 3 min- 
utes. The animals explored the maze freely 
but neither in the initial choice of arm (10 
chose the coarse pattern, 12 the small one, 
2 stayed on the grey) nor in total amount 
of time spent on either arm did they ex- 
hibit any preference for the coarse texture. 
It would seem, then, that the preference for 
the coarse texture is related to descent from 
a board where the animal has a choice of a 
coarse or a more finely textured surface. 
That the behavior is based on choice be- 
tween two surfaces was suggested by an- 
other experiment where the surface below 
the animal was varied in texture but the 
cliff was one-sided as in the threshold ex- 
periment (see earlier discussions). Latency 
of descent was measured to four surfaces: 
red checked squares of 14”, 2", 1", and i. 
all 4” below the animal and directly under 
the glass. Latencies tended to be longer only 
for the Ф” squares; the others were the 
same. 

It was a matter of considerable interest 
whether another species would show this 
preference for descending to a coarsely tex- 
tured floor as opposed to a finely textured 
one. Accordingly, a group of baby chicks, 
1-2 days old, was run with the same appa- 
ratus and conditions as the hooded rats (ex- 
cept for lowering the center board height to 
2”). The chicks were observed for a 10- 
minute interval, rather than a 3-minute one, 
since they were much slower to descend. But 
the chicks, as Table 15 shows, did not be- 
have like the rats. There not only was no 
preference for descending to the coarse tex- 
tured floor, there was even a greater tend- 
ency to descend to the finely textured sur- 
face. The difference (33% vs. 46%) is not 
significant, but the chicks differed signifi- 
cantly from the rats. 
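
The contrast between the rats and the chicks can be illustrated with an exact two-sided binomial (sign) test against an even split. This is only a sketch: the text does not say which test was used, and the counts below are reconstructed from the rounded percentages in Table 15 (for example, 84% of 37 taken as 31 animals), so they are assumptions rather than reported data. The function name is hypothetical.

```python
from math import comb


def two_sided_binomial_p(successes, trials):
    """Exact two-sided binomial p value for a 50:50 null hypothesis."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    extreme = max(successes, trials - successes)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(trials, k) for k in range(extreme, trials + 1)) / 2 ** trials
    # Double for the two-sided test; cap at 1 for near-even splits.
    return min(1.0, 2 * tail)


# Counts reconstructed from Table 15 percentages (assumed, not reported);
# animals that did not descend are left out.
rats_coarse, rats_fine = 31, 6        # adult hooded rats, N = 37
chicks_coarse, chicks_fine = 15, 21   # baby chicks, N = 46, 10 no descent

p_rats = two_sided_binomial_p(rats_coarse, rats_coarse + rats_fine)
p_chicks = two_sided_binomial_p(chicks_coarse, chicks_coarse + chicks_fine)

print(f"rats: p = {p_rats:.5f}")
print(f"chicks: p = {p_chicks:.3f}")
```

With these reconstructed counts the rats' split is far from chance while the chicks' is not, which agrees with the pattern described in the text.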

The large cliff (Model III) was employed 
to test two young goats (12 days old) in 
a similar setup. The pattern on one side 
was 3” squares, in a checkered pattern, that 
on the other 3” squares. The center board 
was 3" high. Both kids left the center 
board at once, one walking first to one side, 
one walking first to the other. Both wan- 
dered about impartially from one side to the 
other. It is conceivable that a preference 
might have appeared if they had had to de- 
scend from a greater height, but under these 
conditions the behavior was equivalent for 
the two sides. 

Why should the rats descend nearly al- 
ways to the coarse textured surface, com- 
posed of large elements, whereas the chicks 
and goats show no such preference? This 
difference in choice of substrata might be 
an innate species difference. We cannot 
draw such a conclusion from these data, 
however, since there is another difference, 
that of age, between the chicks, at least, and 
all the rats. Most of the chicks were less 
than 1 day old, none more than 2. The 


goats, at 12 days, were somewhat older. But 
even the young rats were a month old and 
had, therefore, had more visual experience. 
This experience might somehow provide an 
opportunity to acquire the preference for 
descending to a coarsely textured substra- 
tum. The possibility was checked in the ex- 
periments following in the section on dark- 
reared animals. 


Motion Parallax as a Cue 


In order to isolate the role of motion 
parallax from density, the density ratio had 
to be one; that is, there had to be no differ- 
ence in optical texture density between the 
two sides, although one remained farther 
from the animal’s eye than the other. Two 
patterned surfaces were obtained which 
were identical in coloring and pattern (al- 
ternate squares of white and red) except 
that the squares in one were three times the 
size of the other. The material with larger 
squares was placed on the floor below the 
apparatus. The material with the smaller 
squares was placed directly under the glass. 
The height of the apparatus was adjusted 
so that the floor on the deep side was ex- 
actly three times as far from the animal’s 
eye as was the surface of the shallow side.¹⁰ 
Thus, texture density (or angular size of 
the elements) was equated for the two sides. 
But since one surface was farther away, any 
movement of the animal's head would pro- 
duce differential motion by virtue of paral- 
lax. That is, the elements of the closer sur- 
face would appear to move with a greater 
velocity than those of the other. This differ- 
ence was very easy for the experimenters 
to observe. That the animals did move their 
heads and thus produce this differential 
stimulation was also obvious. The rats 
typically moved their heads vertically up 
and down in a bobbing motion looking first 
on one side and then on the other. They 
also frequently walked the length of the 
board looking from side to side. When the 
animal lowered his head and raised it again, 
the motion would produce an “expansion 
pattern” of motion perspective as described 
by Gibson, Olum, and Rosenblatt (1955) 
and would yield good potential information 
for relative depth of the two surfaces. 

¹⁰ Assuming that the animal's eye is 1" above 
the platform whose surface is 4" above the glass, 
the distance to the shallow surface is 5", to the 
deep surface 15". If the animal's eye is not pre- 
cisely 1" above the center board there is a slight 
density difference but it is much less than in the 
standard situation and probably not discriminable 
(if the animal's eye is 4" above the glass, the ap- 
proximate ratio is .86 to 1.00, at 5" 1.00 to 1.00, 
and at 6" 1.11 to 1.00; when the animal's eye is 
5" above the glass in the standard situation the 
ratio is .33 to 1.00 or a 3 to 1 ratio). 
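
The arithmetic behind the equated-density arrangement can be checked with a short calculation. The sketch below assumes a small-angle approximation, in which the angular size of a square is proportional to its physical size divided by its distance from the eye. The 10" drop is the standard floor depth mentioned earlier; the function name and parameters are illustrative only.

```python
def projected_size_ratio(eye_above_glass, size_ratio=3.0, drop=10.0):
    """Ratio of the angular size of a deep-side square to that of a
    shallow-side square (small-angle approximation).

    eye_above_glass: height of the eye above the glass, in inches.
    size_ratio: physical size of deep-side squares relative to the
        shallow-side squares (3.0 in the equated arrangement, 1.0
        when the same pattern is used on both sides).
    drop: distance of the deep surface below the glass, in inches.
    """
    shallow_distance = eye_above_glass
    deep_distance = eye_above_glass + drop
    return size_ratio * shallow_distance / deep_distance


for height in (4.0, 5.0, 6.0):
    print(height, round(projected_size_ratio(height), 2))

# Standard situation: identical pattern on both sides, eye 5" up.
print(round(projected_size_ratio(5.0, size_ratio=1.0), 2))
```

Under these assumptions the ratio is about .86 at 4", exactly 1.00 at 5", and roughly 1.1 at 6", while the standard situation gives .33. These figures are in line with the values quoted for the equated and standard arrangements.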

This situation, for the rats, seems effec- 
tively to rule out any other differential stim- 
ulation for depth except accommodation (if 
it is effective at all). Binocular parallax is 
improbable for this animal. Brightness was 
of course equated for the two sides. 

Adult hooded rats were run first in the 
experiment. As Table 16 shows, a prefer- 
ence for the shallow side (83% descended 
there) existed despite the elimination of the 
texture difference between the two sides. 
Even when there was no “jump” in texture 
from one side to the other, the animals quite 
consistently chose the shallow side. The 
stimulus basis for this choice almost cer- 
tainly was the differential motion parallax. 

A group of young hooded rats (30 days 
old) run in the same situation exhibited an 
even more consistent preference. All the 
descents were to the shallow side, and none 
to the deep. It appears, therefore, that mo- 
tion parallax produced by the animal’s own 
movements is an effective differential stimu- 
lus for depth in hooded rats. 


TABLE 16 

BEHAVIOR OF ANIMALS WHEN TEXTURE DENSITY 
(ANGULAR SIZE OF ELEMENTS) IS EQUATED 
FOR THE TWO SIDES 

                           % descent    % descent   % no
S                     N    to shallow   to deep     descent
Adult hooded rats     18   83           17          0
Infant hooded rats    25   72           0           28
1-day-old chicks      27   89           4           7
Young goats           17   100          0           0
Young sheep           12   100          0           0

Finally, a group of 1-day-old chicks was 
run in the same situation. The chicks do 
not often make vertical head movements like 
the rat, but typically dart the head from 
side to side. In this experiment, a majority 
of the chicks, like the rats, descended to the 
shallow side (89% to the shallow, 4% to 
the deep). Thus the chicks at 1 day of age 
probably discriminate depth primarily on the 
basis of motion parallax, since a difference 
in texture density alone was shown in the 
previous experiment to be ineffective for 
them. Perhaps the two stimulus variables, 
when present together, interact in some 
way; but it is notable that with parallax 
present, but isolated from density differ- 
ence, only 1 of 27 chicks descended to the 
deep side. 

It was decided, in an attempt to check for 
binocular cues, to try a few monocular 
chicks on the cliff. Five 1-day-old chicks 
had one eye removed surgically. Several 
hours later, they were run on the standard 
cliff, with 2" checks on the floor 10" below 
the glass on one side and directly under the 
glass on the other. Of these chicks, four 
descended to the shallow side and one to 
the deep.¹¹ This distribution may indicate 
a preference for the shallow side, but this 
was dubious, since the chicks appeared to 
fall off, rather than jump. The notable fact 
about their behavior was a lack of balance. 
The chicks all leaned toward the side away 
from the remaining eye, as if they could no 
longer keep themselves in the upright po- 
sition. It seemed as if the chick depends on 
balanced light to the two eyes to keep itself 
in equilibrium. By the next day posture had 
improved, but the chicks moved around very 
little and only two descended from the 
board. This experiment is inconclusive. It 
seems likely that binocular vision in the 
chick has other functions more important 
than the providing of supplementary infor- 
mation for the discrimination of depth. 

¹¹ Hess (1956) fitted prisms (base out) to the 
eyes of chickens and concluded, because they pecked 
short when viewing binocularly, that "binocular 
depth cues" were employed. But the phrase is 
vague. The experiment does not demonstrate that 
binocular image disparity can be utilized by the 
chicken, but only that it utilizes simultaneous in- 
formation from both eyes in pecking. It should 
be remembered that birds do not have single, con- 
centrated, central foveas, like primates. Whether 
birds can register depth at an edge by means of 
binocular parallax as well as by motion parallax 
is unknown. 

When we were able to obtain sheets of 
material with a large checkered pattern (3" 
checks), it was possible to equate density 
of texture with the large cliff apparatus. 
Directly under the glass on the shallow 
side was placed material patterned with ¾" 
checks. On the floor below the deep side 
and covering a wide area was placed the 
pattern of similar 3" checks. To obtain a 
4 to 1 ratio of height, the glass was set 3’ 
from the floor, assuming that the eyes of 
the animals to be tested would be approxi- 
mately 1’ above the glass. These animals 
were kids and lambs, ranging in age from 
1-35 days old. 

The 17 kids behaved uniformly in a com- 
pletely predictable way. They were placed 
on the center board twice, once from either 
side, and were observed until they left the 
board and for one minute longer. Every 
animal, even the 1-day-old infants, walked 
invariably to the shallow side and avoided 
the deep side. They peered over the deep 
side, but turned away from it and walked 
forward onto the shallow with a normal 
stride and posture. When they were placed 
on the glass over the deep side, all but one 
animal backed up to the wall and stood 
hunched there, legs rigid, back humped. The 
other animal lay prone on the glass of the 
deep side and did not move. 

The 12 lambs also all walked onto the 
shallow side (2 backed onto it). But their 
behavior was not as consistent as that of 
the goats. One of them, when coaxed, walked 
a few steps out on the deep side. Upon 
looking down, it backed off. One backed, 
probably without seeing where it was going, 
onto the deep side. When these animals 
were placed on the glass over the deep side, 
they backed to the wall, sometimes kneeling. 
Two of the lambs went over the side of the 
apparatus, one falling but the other appar- 


ently jumping. 


None of these animals gave evidence of 
responding to the glass as such. It was typi- 
cal for an animal to bump its nose on the 
glass of the deep side when placed there by 
the experimenter. 

It seems clear from these experiments 
that young goats, even at 1 day of age, 
avoid a drop-off when the motion cue is 
present and in the absence of a cue of dif- 
ferential size or texture density. Lambs 
tend to do likewise, but not as consistently. 

In conclusion, we feel that motion paral- 
lax in rats, chicks, goats, and lambs is of 
critical importance as a stimulus for dis- 
crimination of depth. It appears, as well, to 
be operating in extremely young and inex- 
perienced animals. 


Competing Cues 


The question was often asked us what 
would happen if the cue provided by texture 
density difference were put in opposition to 
that of differential motion parallax. We ac- 
cordingly set up a situation in which the 
surface on the floor had a pattern contain- 
ing elements so large (1 ” squares) that 
even at the greater depth the density was 
less than that on the shallow side (}” 
squares, identical shape and color). The 
density cue would then favor the physically 
deep side, but differential motion parallax 
would still exist and would favor the shal- 
low side. Two groups of animals were run 
in this situation, hooded and albino rats. 

In both groups, more animals descended 
to the shallow side than to the deep, though 
the difference was not as great as in some 
of the other experiments (see Table 17). 
A quarter of the hooded rats did not de- 
scend, a rather large number for adult ani- 
mals. The evidence on the whole supports 
the conclusion that motion parallax is the 
more effective cue when the two are put in 
opposition. 


Conclusion 


This section reported (a) attempts to 
isolate the influence of motion parallax by 
equating the projected texture density on 
the two sides of the apparatus, and (b) iso- 
lation of differential texture density as a cue 
by putting a coarse texture directly under 
the glass on one side of the platform and 
a similar fine texture at the same depth on 
the other side. All of the species tested, 
where the size of the optical texture ele- 
ments was the same while height was varied, 
so that motion parallax was present without a 
texture difference, were able to discriminate 
between the two sides. The monkey (see p. 
21), the goat, the sheep, and the chick seem 
to have as good depth discrimination in this 
situation as they do with the regular experi- 
mental setup where texture density is also 
a potential cue. The rat, on the other hand, 
seems to have a slight drop in accuracy un- 
der these conditions—the drop is not large 
or statistically reliable, but it is possible that 
this animal may require more cues for ef- 
fective discrimination than do the more 
strongly visual animals. It cannot be con- 
cluded that motion parallax was the only 
effective cue since accommodation may con- 
ceivably operate in all species and binocular 
disparity is debatable. 

The experiments where optical texture 
density was varied by varying physical tex- 
ture density while height was equal (the 
density preference series) yielded curious 
results. The rats definitely preferred the 
coarser of two patterns (evidence based on 
three experiments) yet neither the chick nor 
the goat exhibited a preference for either 
texture. Experiments on animals reared in 
the dark may throw more light on this prob- 
lem and will be reported in the next section. 


TABLE 17 

COMPETING CUES 

                          % descent    % descent   % no
S                    N    to shallow   to deep     descent
Adult hooded rats    16   50           25          25
Adult albino rats    16   62           19          19


Note.—14" (small)-pattern on shallow surface, 154" on deep 
surface. 


EFFECTS OF DARK-REARING ON DEPTH 
DISCRIMINATION OF RATS AND CATS 


Rearing an animal in the dark prohibits 
prior visual experience, so that the animal's 
first “glimpse of the world” is under experi- 
mental control. Can animals reared in the 
dark discriminate depth as well as normally 
reared animals? This section will report 
three experiments with dark-reared hooded 
rats and one with dark-reared kittens. 

While some animals, like the chick or the 
goat, can be tested within 24 hours of birth 
so that dark-rearing to control prior visual 
experience is irrelevant, both the cat and 
the rat do not have adequate enough loco- 
motor ability to be tested on the visual cliff 
until they are about 4 weeks old. But dark- 
rearing creates many problems of its own. 
These problems actually may hinder the ex- 
perimenter’s search for visual naivete, since 
the differences between dark- and light-reared 
animals may be due to factors other than the 
visual experiential ones. The problems of 
dark-rearing can be divided into physiologi- 
cal, methodological, experiential, and emo- 
tional ones. The fact that dark-rearing may 
have physiological side effects is well known: 
there was a pronounced pallor of the optic 
disk in Riesen’s dark-reared chimpanzees 
and neurological examination showed de- 
generation of retinal ganglion cells (Chow, 
Riesen, & Newell, 1957). Other physiologi- 
cal effects of dark-rearing may not result 
in direct damage to the visual system but 
may interfere instead with other bodily proc- 
esses and this could affect the normal be- 
havior of the animal when it is tested. For 
example, the light cycle controls gonadal de- 
velopment in birds; thus in some species 
dark-rearing might induce aberrations in be- 
havior unrelated to vision that could lead 
to different behavior in the testing situation 
(cf. Brown, 1959; Hendricks, 1956). A 
second source of difficulty is methodological. 
Few studies of dark-rearing raise animals 
under conditions such that differential light 
stimulation is the only experimental vari- 
able. The light-reared, for example, can 
usually see laboratory personnel and thus 
become adapted to the sight of humans. 
Testing conditions may also not be equiva- 
lent for the dark-reared and the light-reared. 
The practice of keeping the dark-reared in 
the dark except for testing while the light- 
reared are kept in the light, causes differ- 
ential dark adaptation which, rather than 
prior experience, might be a factor. A 
methodological problem arises in connection 
with light stimulation itself. The only 
variable purposely controlled has been pat- 
terned light exposure as compared to non- 
patterned, translucent exposure (Mishkin, 
Gunkel, & Rosvold, 1959; Riesen & Aarons, 
1959; Siegel, 1953). Yet the studies of 
Hebb’s students (Forgays & Forgays, 1952; 
Hymovitch, 1952) have shown that depth in 
the field of view may be an important vari- 
able: animals reared with adequate patterned 
light experience but with a field of view that 
included only near objects did not perform 
as well in maze learning as those with an 
unrestricted field of view. The third problem 
in dark-rearing, that of the control of spe- 
cific early behavior, has not been the topic 
of research, but its possibility is well known. 
The dark-reared animals may learn other 
habits, to rely on nonvisual cues for ex- 
ample, and hence perform inadequately in 
visual problems whether their present visual 
system is adequate or not. Dark-reared ani- 
mals may also be more “emotional” (Gibson 
et al., 1959). This can be due to disturbance 
at the unexpected light stimulation or it 
might conceivably be a physiological side 
effect of dark-rearing. 

All of these factors show that dark-rear- 
ing for control of prior visual experience 
may lead to complications. For these rea- 
sons, animals in the following experiments 
were maintained in as equivalent conditions 
as possible, in the hope that differential light 
stimulation up to the moment of testing was 
the main variable. Weights were also taken 
of the rats to make sure the general pattern 
of development was comparable in light- 
and dark-reared. 


Rats 


In the first study rats were reared in 
light or dark, but otherwise identical, en- 
vironments for the purpose of comparing 
depth discrimination 20 minutes after the 


dark-reared animals were first exposed to 
the light. The second study investigated 
the stimulus determinants of depth dis- 
crimination after differential rearing. Ani- 
mals were reared in the same cage environ- 
ments and one group of light-reared had 
patterned light experience but a restricted 
depth of field. In addition, all animals were 
tested after being removed from a dark 
room so that light adaptation was equiva- 
lent. The third study continued the in- 
vestigation of stimulus determinants with 
30-day-old animals. While the dark-reared 
were kept in the same environment as that 
of the second study, the light-reared were 
raised in the main laboratory colony. 

The first experiment using dark- and 
light-reared animals has been reported 
before (Walk, Gibson, & Tighe, 1957). 
Animals were kept in identical wire mesh 
cages surrounded by cardboard walls, some 
within a lightproof room and others in a 
similar, lighted room. At the age of 90 
days, animals were removed from the dark 
and tested on the original apparatus. The 
results are shown in Table 18. In depth 
discrimination, there is no difference be- 
tween dark- and light-reared. While the 
latencies look higher for the dark-reared, 
they are not significantly higher. In the 
discrimination learning portion of this ex- 
periment, the dark-reared animals were 
much harder to pretrain but this “emo- 
tionality” did not seem to affect behavior 
on the visual cliff. 

The second experiment with dark-reared 
animals also used adult hooded rats. In 
this experiment, there was more precise 
control of the environment to increase the 
certainty that the differential visual ex- 
perience was the critical experimental 
variable. Eight wooden boxes of $^ pine 
were constructed of identical size with 
outside dimensions 47" long X 12" high 
and 111" deep. All boxes were covered 
in the front by a door framed with glass 
hinged at the top, the glass measuring 
9" x 43". Additional ventilation was sup- 
plied by ventholes in the rear of the boxes 
which were mounted 21" away from the 
wall. For the light-reared, five boxes 
mounted on top of each other were used. 
The remaining three were for the dark- 
reared. The rearing procedure assured that 
ventilation, auditory, and olfactory experi- 
ence would be similar for both groups. 


TABLE 18 

DEPTH DISCRIMINATION OF ADULT HOODED RATS IN THE FIRST DARK-REARING EXPERIMENT 

                      Shallow side            Deep side
                      % of       Median       % of       Median      % no
Group            N    descents   latency      descents   latency     descent
                                 (seconds)               (seconds)
Light-reared     29   80         1            10         5           10
Dark-reared      19   74         5            16         20          10

Note.—In this experiment the animals were placed on the center board in a box to avoid handling bias. 

The light-reared were housed in a room 
of identical size as the dark-reared. Light- 
ing was supplied by 60-watt bulbs, two 
being located on each side of the boxes, 
one pair near the top of the tier, one pair 
at the bottom of the tier, so that bulbs 
were not visible to the animals. The three 
bottom tiers were also faced with white 
cardboard 4” in front of the glass. Thus, 
approximately half of the light-reared had 
a visual environment that extended through 
the glass and around the room, while the 
others' visual environment extended only 4” 
(these will be referred to as the full vision 
group and the restricted vision group, re- 
spectively). Lighting was controlled by a 
poultry timer so that lights were on from 
10 P.M. to 10 A.M., off from 10 A.M. to 
10 P.M. The lighting cycle insured that 
when the animals were to be tested in the 
afternoon, they would be tested on emer- 
gence from the dark so that light adaptation 
would be equivalent for all groups. 

For the dark-reared rats, black card- 
board was taped over the front of the glass 
as an additional precaution against the 
admission of light. 

Litters were split and the pups distrib- 
uted among the three groups shortly after 
birth. All animals were reared in wire 
mesh cages 20" long X 9$" deep X 61^" 
high, placed inside the wooden boxes, until 
they were 30 days old. At that time, the 
animals were weaned, the sexes separated, 


and the subjects placed two to a cage in 
cages 6" wide X 6" high X 8" deep, five 
cages to a box. 

The animals were tested at 90 days of 
age, about 20 minutes after removal from 
the dark. The testing was spread over 
several days, testing by group randomized, 
and subjects returned to the dark after 
testing. All animals were in good health 
when tested. No weight differences among 
groups were observed, the median weights, 
by sex, being distributed as follows: light- 
reared full vision group, males 288 grams, 
females 193 grams; light-reared restricted 
vision group, males 310 grams, females 201 
grams; dark-reared males 295 grams, fe- 
males 189 grams. 

Approximately half of the animals in 
each group were tested on the setup with 
equal texture density but height variable 
(see section on Experiments on the Effec- 
tive Stimuli for Discriminating Depth on 
the Visual Cliff), the other half on the 
setups with texture density variable but 
actual height equal. Each animal was 
tested only once. As Table 19 shows, all 
groups behaved similarly in the equal 
density situation, about 70% descending 
from the board to the shallow side which 
contained ¼" squares under the glass and
30% to the deep side where the ¾" squares
were 10" below the glass. We infer, there- 
fore, that motion parallax can operate to 
produce differential behavior in all three 
groups, including the dark-reared. 

In the series with texture density vari- 
able but height equal, on the other hand, 
the three groups did not behave in the

TABLE 19

COMPARISON OF ADULT HOODED LIGHT- AND DARK-REARED RATS ON DEPTH DISCRIMINATION
WITH DENSITY EQUATED AND ON PREFERENCE FOR PATTERN DENSITY

                                           Shallow side           Deep side
                                        %         Median       %         Median      % no
Group                              N    descents  latency      descents  latency     descent
                                                  (seconds)              (seconds)
I. Equal density; height variable
   Light-reared, full vision       10   70        15           30        45          -
   Light-reared, restricted vision 11   73        21           27        24          -
   Dark-reared                     12   67        12           25        27          8
II. Texture variable; height equal
                                           Large pattern          Small pattern
   Light-reared, full vision       11   82        10           18        18          -
   Light-reared, restricted vision 10   50        5            50        20          -
   Dark-reared                     12   33        20           67        13          -


same way. The light-reared full vision 
group showed a preference for the large 
pattern (82% chose this side), confirming 
the results reported in the preceding section 
for a similar experiment with adult hooded 
rats and again with 30-day-old ones, all 
reared under normal conditions in a large 
colony room. But the animals raised under
conditions of restricted vision did not show 
a preference for either texture. The dark- 
reared group even reversed the preference, 
though the number of cases is too small 
for a statistically reliable difference. The 
results suggest that rearing conditions may 
in fact alter the role played by texture 
density as a cue. 

The third dark-rearing experiment used 
young hooded rats 30 days old. The ani- 
mals lived from shortly after birth in the 
boxes in the dark room described in the sec- 
ond experiment until ready for testing. Litters 
were not split. Instead, these animals were 
compared to young animals (25-33 days 


old) reared in the normal colony environ- 
ment. Light-reared animals were kept in 
small cages which permitted full vision 
through the sides and top of the cages. 
The young animals were tested approxi- 
mately 20 minutes after leaving the dark. 
Separate groups were tested on the stand- 
ard height discrimination, on the depth 
discrimination with texture density equated 
but height variable, and on the preference 
for two patterns of varying density (¼"
and ¾" squares) directly under the glass.
As Table 20 shows, the young dark-reared 
animals behaved similarly to the light-reared 
on both the standard depth discrimination 
and the equal density one, confirming the 
experiments with adult hooded dark-reared 
and light-reared animals. The density pref- 
erence results were also similar to the results 
with adult animals in the second dark- 
rearing experiment, that is, a slight prefer- 
ence for the fine pattern in the dark-reared 
as against a marked preference for the
coarser pattern in the light-reared. For
this experiment, the light-dark-reared dif- 
ference on size preference is statistically 
significant (p < .01). The importance of 
rearing conditions for the role played by 
texture density was thus confirmed. 
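
As a rough check on the size of this difference, the sketch below runs a two-sided Fisher exact test in Python. It is an illustration only. The text does not say which test gave p < .01; the counts are back-calculated from the percentages and group sizes in Table 20 (an assumption); animals that did not descend are left out; and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Add up every table that is no more probable than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Assumed counts, recovered from Table 20 (texture density varied, height equal):
# light-reared, N = 24: 88% coarse -> 21 animals, 8% fine -> 2 animals
# dark-reared,  N = 22: 36% coarse -> 8 animals, 50% fine -> 11 animals
light_coarse, light_fine = 21, 2
dark_coarse, dark_fine = 8, 11

p_value = fisher_exact_two_sided(light_coarse, light_fine, dark_coarse, dark_fine)
print(f"two-sided Fisher exact p = {p_value:.5f}")
```

Under these assumed counts the p-value falls below .01, which agrees with the significance level reported in the text, though the original analysis may have been a different test.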

It seems clear from these experiments 
with the dark-reared animals that some 
depth discrimination in the rat is unlearned, 
that motion parallax is effective as a cue 
without learning, but that a difference in
texture density alone does not operate in
the same way in light-reared and dark- 
reared rats. How experience might affect 
the role of this variable will be discussed 
later. 


Dark-Reared Cats 


In view of the near equivalence of be- 
havior of normally-reared and dark-reared 
rats, it seemed interesting to compare a 
more strongly visual animal such as the cat 
under these two conditions. Two pregnant
cats were brought to the laboratory shortly
before parturition and housed there. Each 
cat was kept in a wire mesh cage 26" X 
26” X 26” in a large animal room. About 
5 days after the kittens were born, the 
cage with mother and litter was removed 
to a large darkroom. The room was well 
ventilated but no light whatsoever entered. 
Care of the cats demanded a small amount 
of light when the experimenters entered to 
feed the animals and clean cages. A pho- 
tographer’s red flashlight was used for 
this purpose. No other light entered with 
the experimenter, since the room was ap- 
proached through double doors with a small 
unlighted corridor between them. One door 
was closed before the other was opened. 
Care requiring the use of the flashlight took 
approximately 10 minutes a day. 

One of the mothers was restless in the 
darkroom. She howled and attempted to 
escape from the cage. She was, therefore, 
given a tranquilizer (10 milligrams of 


TABLE 20

LIGHT-REARED AND DARK-REARED HOODED RATS 30 DAYS OLD
COMPARED IN PERFORMANCE ON THE VISUAL CLIFF

                         Shallow side           Deep side
                      %         Median       %         Median      % no
Condition        N    descents  latency      descents  latency     descent
                                (seconds)              (seconds)
Standard: both height and density varied
   Light-reared  22   77        29           5         8?          18
   Dark-reared   14   79        34           14        17          7
Equal density: height varied
   Light-reared  25   72        23?          28        ?           0
   Dark-reared   20   65        35           20        19          15
Texture density varied: height equal
   Light-reared  24   88        14           8         21          4
   Dark-reared   22   36        18           50        10          14

chlorpromazine) twice a day. She was
removed from the darkroom for this purpose
and returned to her litter again immedi- 
ately after medication was administered. 
The drug had the desired effect; she 
settled down and cared for her litter as 
adequately as the other cat. The drug was 
discontinued a day before the kittens were 
brought out. It did not appear to affect 
them, however, since they seemed through- 
out to be as lively as the other kittens. 

The kittens were removed from the dark- 
room when they were 26 days old and kept 
thereafter in the lighted animal room where 
they were born. When they were first 
brought out they were kept for 20 minutes 
in a box arranged to provide only homogeneous
light, and were tested on the cliff
immediately upon removal from the box. 

The light-reared kittens with whom these 
were compared have already been described. 
They were reared in the laboratory in a 
cage identical with those housing the dark- 
reared animals and in the same animal 
room where those animals were born. They 
were tested when they were 27 days old 
(see Comparative Experiments). 

Each dark-reared kitten was tested on 
the large cliff with a 24” high center board, 
exactly as the control litter (27 days old) 
had been tested. Six trials were given, one 
after another, each lasting 2 minutes. Upon 
completion of these, the kitten was placed 
in the center of the glass on each side and 
its behavior there observed. 

After the first day, the kittens were each 
observed for two trials daily for 15 days 
in order to determine the progress of fur- 
ther maturation. They were also observed 
for the visual placing response, following 
a moving object with the eyes, and general 
visual motor coordination. 

All the dark-reared kittens (N = 9), 
when first brought out, appeared to have 
no responses to visual stimulation except 
the pupillary reflex. There was no visual 
placing response (cf. Riesen & Aarons, 
1959), no following response, no blink, and 
the kittens bumped into walls which were 
squarely in front of them. Their move- 

ments were awkward; they crawled, rather 
than walked, keeping their stomachs in 


contact with the ground. When they were 
placed on the center board of the cliff, 
they either remained still (no descent), or 
fell off after moving a short way. They 
were as likely to fall forward as backward. 
Table 21 shows the place of descent (or 
no descent) for these animals. It can be 
seen that the animals descended about 
equally often to either side, and one-third 
of the time did not descend. Furthermore, 
no single animal showed a preference for 
either side. In great contrast to this behavior
is the record of the light-reared
kittens of the same age. None of them 
descended at all to the deep side, and 86% 
of the time they descended to the shallow 
side. Every kitten showed a preference 
for the shallow side. 
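
The summary percentages quoted above can be recomputed from the per-kitten counts in Table 21. The short Python sketch below does so. The variable and function names are illustrative, and the first cell for kitten IIa is read as 1, the only value consistent with six trials per kitten.

```python
# Per-kitten counts from Table 21: (times to shallow, times to deep, no descent).
dark_reared = {
    "Ia": (3, 2, 1), "Ib": (1, 3, 2), "Ic": (2, 2, 2),
    "Id": (3, 3, 0), "Ie": (1, 1, 4), "IIa": (1, 2, 3),
    "IIb": (0, 2, 4), "IIc": (3, 1, 2), "IId": (3, 3, 0),
}
light_reared = {
    "1": (6, 0, 0), "2": (5, 0, 1), "3": (6, 0, 0),
    "4": (6, 0, 0), "5": (3, 0, 3), "6": (5, 0, 1),
}


def outcome_percentages(counts):
    """Percent of all trials ending in shallow descent, deep descent, no descent."""
    totals = [sum(column) for column in zip(*counts.values())]
    n_trials = sum(totals)
    return [round(100 * t / n_trials, 1) for t in totals]


# Total number of descents to the deep side by the light-reared kittens.
deep_descents_light = sum(c[1] for c in light_reared.values())

print("dark-reared :", outcome_percentages(dark_reared))
print("light-reared:", outcome_percentages(light_reared))
print("light-reared deep descents:", deep_descents_light)
```

With six trials for each of nine dark-reared kittens (54 trials) and six light-reared kittens (36 trials), the figures agree with the table to rounding.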

The dark-reared kittens, after the first 
day's tests on the cliff, were placed on the 
center of the glass of each side as the 
light-reared ones had been (described 
in Comparative Experiments). The light- 


TABLE 21

BEHAVIOR OF DARK- AND LIGHT-REARED KITTENS
27 DAYS OLD ON VISUAL CLIFF

                 Times to    Times to    No
                 shallow     deep        descent
Dark-reared
   Ia            3           2           1
   Ib            1           3           2
   Ic            2           2           2
   Id            3           3           0
   Ie            1           1           4
   IIa           1           2           3
   IIb           0           2           4
   IIc           3           1           2
   IId           3           3           0
   %             31.5        35.2        33.3
Light-reared
   1             6           0           0
   2             5           0           1
   3             6           0           0
   4             6           0           0
   5             3           0           3
   6             5           0           1
   %             86          0           14

Note.—Each kitten was tested six times for 2-minute intervals.


reared kittens behaved very differently on
the two sides; they characteristically hugged
the wall or backed in a circle on the deep 
side, but simply walked forward on the 
shallow side. The dark-reared kittens did 
not make such a distinction. The behavior 
varied for different animals. One animal 
circled on both sides; one walked, or rather 
crawled, toward the wall on both sides 
(perhaps getting echoes from his constant 
mewing); some simply sat still; several 
walked into a wall and bumped their noses.
The kittens, then, did not mature visually
in the dark. Rats and cats are thus clearly
different in this respect. But do the dark- 
reared kittens catch up to their controls 
in the light? And how long a period 
is required before normal behavior is 
achieved? Figure 12 shows the percentage 
of descents to the shallow side as compared 
to the deep side from the day the kittens 
were brought out of the darkroom through
the seventh day in the light. (The no
descent category has been disregarded here,
since this changed very little from day to
day. The two curves in Figure 12 are
thus reciprocal.) On the first day, descents
were divided about equally between the 
two sides. Thereafter, the preference for
the shallow side increased rapidly and became
complete (100%) by the seventh day.

Fig. 12. Percent of descents of dark-reared cats
to a deep or to a shallow surface as a function
of days in the light.

No kitten tested after the seventh day chose 
the deep side. Therefore, the normal pref- 
erence for descending to the shallow side 
developed as the kittens lived in the light. 
We infer that this is based primarily on 
maturation of visual discrimination of the 
depth differences. There was no reinforce- 
ment for choosing the shallow side; if 
anything, the kittens should have learned 
that the deep side was safe, since they 
descended there in the beginning without 
harm. 

The other visual-motor behavior of the 
kittens did not seem fully mature before 
10 days or more, in all cases. By the 
fourth day, the cats followed a moving ob- 
ject, mostly by jerking the head sidewards. 
By the fifteenth day, good pursuit move- 
ments of the eyes could be observed. The 
experimenters never observed convergence 
movements, but it is dubious whether these 
are generally present in normal cats. 

The visual placing response appeared in 
gross form in some cats after a few days, 
but it was not a well-coordinated placing 
of one paw (as it is in the normal 27-day- 
old kitten) until about 10 days out. Motor 
coordination developed gradually, too. By 
the fourth day, few animals slipped off the 
board (some still did, however), the curi- 
ous crawling locomotion had in most cases 
given way to upright walking, and some of 
the animals had lost the staring-straight- 
ahead look of the first day as eye move- 
ments began to appear. None of the times 
given here can be considered as absolute, 
since different housing conditions (a whole 
room instead of a cage, for instance) might 
have speeded up maturation. 

Finally, the behavior of the cats when 
placed on the glass of the deep side changed. 
Here are some typical protocols for four 
animals on the eleventh day out: 

1. Circles and backs to wall. 

2. Circles backward. 

3. Crawls on stomach to wall, then backs along 
edge. 

4. Crawls to wall and walks forward along
edge hugging wall and stepping on glass over supports
only.

This behavior was quite different from that
when the animals were placed on, or stepped 
down to the shallow side, where they walked 
about freely. 

We conclude from comparing the be- 
havior of dark- and light-reared kittens at 
27 days that response to visual stimulation 
had not matured normally in dark-reared 
animals. However, as the dark-reared kit- 
tens lived in normal conditions of daylight 
thereafter, there was gradual maturation 
of the response to visual stimulation, ap- 
parently complete by about 10 days. The 
reaction to visual support, or lack of it, 
was observed especially and was found 
absent on emergence from the dark but 
entirely present after 10 days in the light. 
This development occurred without any 
external reinforcement in the experimental 
situation. 


Conclusion 


At the beginning of this section certain 
problems of dark-rearing were discussed. 
Despite the difficulties of experimental in- 
ference sometimes involved in dark-rearing 
experiments, our studies yielded very clear- 
cut results. The dark-rearing experiments
on the hooded rat and the cat showed a
striking species difference between these 
animals since dark-reared rats were not 
deficient in visual depth discrimination 
when first brought into the light, whereas 
the cats were. However, this ability ma- 
tured in the cats after a few days in the 
light. That the ensuing development was 
primarily continued maturation was in- 
ferred, since no reinforcement in the ex- 
perimental situation was provided. 

The roles of differential texture density and
differential motion parallax were studied 
with dark-reared rats, with the result that 
motion parallax appeared to be as effective 
as a cue in dark-reared as in light-reared 
animals, while texture density was not. 


DISCUSSION OF RESEARCH AS A WHOLE


This research has been concerned with a 
comparative and analytic investigation of 
depth perception as evidenced by the avoidance
of a cliff. This type of depth discrimination
is highly adaptive and is, in fact,
manifested by many species, as we have
shown. In discussing the results, several
related questions arise. First, is the discrimination
of depth learned? Second, is
the fear of cliff edges that presumably goes 
with potential loss of support learned? 
Finally, are species comparisons possible? 


Depth Discrimination and Learning 


It was not the purpose of this research 
to solve the "nature-nurture" problem which 
has, rightly or wrongly, caused psycholo- 
gists to do battle with one another for 
centuries (see Hochberg, in press). Our purpose,
as stated, was a comparative investigation
of depth perception with a technique
which could be used identically for many 
species, at an early age and without special 
training. But the question was asked us 
over and over again, is it innate or is it 
learned? 

Since the technique can be applied with- 
out special training, often when the animal 
has had no previous visual experience at 
all, there are, in the results, data which are
relevant to the nature-nurture issue. It
seems well to state these explicitly and to
consider what light, if any, has been thrown
on the problem. 

Two kinds of experiments have a direct 
bearing. First, the experiments on very 
young animals (1-day-old or less), and 
second, experiments with dark-reared ani- 
mals. 

Experiments with animals less than a 
day old were possible with chicks, lambs, 
and kids, since these animals are capable 
of locomotion shortly after birth. When 
tested at this time with the standard cliff 
situation, all the subjects observed, of these 
three species, showed good discrimination 
of depth; as good, in fact, as the older 
animals of the same species. Perhaps
someone would point out that there have 
been a few hours of exposure to visual 
stimulation before the test took place. 
While this was true, any learning which 
occurred in this interval must have been
very limited indeed. There could have been
little or no opportunity to learn through 
reinforcement by falling or through tactual 
and kinesthetic confirmation of different 
surface depths by actual exploration or 
climbing up and down. 

The second case is even clearer. Rats 
reared in the dark discriminated between 
a long and short drop-off 20 minutes after 
removal from the darkroom. Also, Nealey 
and Edwards (1960) repeated this experi- 
ment and had animals adapt in homogen- 
eous light instead of patterned light. In 
all cases, the discrimination was still effective.
The conclusion, confirming Lashley
and Russell (1934), seems inescapable, that 
in hooded rats the ability to discriminate 
depth is innate. We cannot assume that 
this conclusion applies to other species. But 
if it is true of some, then theories that 
attempt to explain space perception must 
allow for a built-in mechanism in at least 
one species. 

The dark-reared cats did, of course, pre- 
sent a different picture. They did not 
discriminate 20 minutes after removal from 
the dark, but they did after 3 or 4 days 
of living in the light, in a cage. Did they 
learn during this time, and if so, what? 
The cats were tested on the cliff on suc- 
ceeding days (their only opportunity for 
locomotion outside the cage). Reference 
to Figure 12 shows that the preference for 
the shallow side increased by the seventh 
day in the light to 100%. Why? The ani- 
mals had equal experience descending on 
the two sides in the beginning; according 
to a reinforcement learning theory, they 
should have learned that descent to either 
side was perfectly safe—the glass surfaces 
were identical tactually and kinesthetically. 
How is the visual difference learned, if not 
by confirmation? And if one supposes it is 
learned without differential reinforcement, 
how is the eventual behavioral preference ac- 
quired? Despite the different early experi- 
ence, their preference for the shallow side 
gradually grew to that of the normal animal. 
It seems most reasonable, therefore, to adopt
the hypothesis that retinal processes were 
maturing in these animals which required 
the stimulus of a certain amount of light, 


probably patterned light. If this is learning, 
it does not fit any current definition. 

In the monkey as well, there seems to be 
evidence of maturation after birth. When 
first presented with the cliff (at 10 days 
and 12 days) the two monkeys showed un- 
certain discrimination and actually crossed 
the glass of the deep side. But the behavior 
of the animals over the deep side was 
markedly different from that on the shallow 
half. Two weeks later, nothing (bottle, 
blanket, calling by the experimenter who 
cared for them) would induce them to 
cross. This evidence is, so far, inconclusive 
and investigation must be pushed further 
(perhaps with experience in the situation 
every day), but it suggests a maturing 
visual mechanism rather than learning. 

Evidence from the human infants does 
not carry us much further. They had ample 
opportunity for visual experience by the 
time they were tested. There was no 
evidence that the infants had acquired 
a tendency to avoid a drop-off or to 
fear heights from having fallen off them. 
Younger subjects, barely crawling, dis- 
criminated depth, if they could be tested 
at all, as well as the older ones. The
evidence from the human infants fits in so 
well with other late maturing species as to 
make it plausible to ascribe this depth dis- 
crimination at least in part to built-in 
mechanisms. A learning explanation cannot 
definitely be disproved, but it would need 
to be greatly elaborated to permit any 
testable hypotheses. On the one hand, 
phylogenetic continuity should not be denied. 
On the other hand, the complexity and 
redundancy of stimulation for the dis- 
crimination of depth in the human species 
make any categorical decision premature 
and ill-considered. Practice has been dem- 
onstrated to improve some kinds of distance 
judgment in the human adult (Gibson & 
Bergman, 1954). The interplay of innate 
and learned factors is a problem needing 
study and research. 

Another aspect of the nature-nurture 
question can be examined with regard to 
the rat. We know that the hooded rats
had a strong innate preference for the 
shallow side of the apparatus. On what
differential stimulation could this depend?
When an attempt was made to separate 
the cues of density difference and motion 
perspective, light-reared rats seemed to 
show a preference when either of these 
alone was present (differing on the two 
sides). But the density difference as such 
elicited no preference in the dark-reared 
rats. Motion parallax isolated from density 
did. The inference is that the density 
gradient had somehow acquired cue value 
(when the animal is looking down) in a 
light-reared animal. The fact that the 1- 
day-old chicks, lambs, and kids did not 
respond preferentially to a difference in 
density alone but did when differential 
motion parallax was present without a 
density difference supports the inference. 

If the cue for the density difference is 
acquired, it should follow that the dark- 
reared rats, after exposure to light, should 
acquire the preference and be similar to 
normal light-reared animals. In order to 
test this prediction a group of dark-reared 
animals was tested twice on density preference,
once immediately after removal
from the dark and again after a week in
the light. Similarly, a group of light-reared 
rats of the same age (30 days) was tested 
when the dark-reared were run and again 
one week later. On emergence from the 
dark, the 20 dark-reared subjects went as 
follows: 6 subjects (30%) chose the large
(¾") pattern, 8 subjects (40%) the small
(¼") pattern, and 6 subjects (30%) did
not descend from the center board in 5 
minutes. One week later 13 subjects (65%)
chose the large pattern, 2 subjects (10%)
the small one, and 5 subjects (25%) did
not descend from the center board. Among
the 22 light-reared, 17 subjects (77%)
chose the large pattern and 5 subjects
(23%) the small one on the initial run.
One week later the light-reared split as
follows: 15 (68%) picked the large pattern
and 7 (32%) the small one. Among
the dark-reared, then, exposure to light 
leads to behavior similar to that of the 
light-reared, and a second testing experi- 
ence does not change the preference in the 
light-reared. 
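
The percentages in this comparison follow directly from the reported counts. The Python sketch below recomputes them and adds the share choosing the large pattern among only those animals that descended. That second figure is not given in the text and is included as an illustrative convenience; the names used are hypothetical.

```python
# Counts reported in the text for the test-retest comparison on density preference.
dark_initial = {"large": 6, "small": 8, "no descent": 6}     # 20 dark-reared rats
dark_week_later = {"large": 13, "small": 2, "no descent": 5}
light_initial = {"large": 17, "small": 5, "no descent": 0}   # 22 light-reared rats
light_week_later = {"large": 15, "small": 7, "no descent": 0}


def percentages(counts):
    """Percent of the whole group falling in each outcome, to the nearest whole number."""
    total = sum(counts.values())
    return {outcome: round(100 * n / total) for outcome, n in counts.items()}


def large_share_among_descenders(counts):
    """Percent choosing the large pattern among animals that descended at all."""
    descended = counts["large"] + counts["small"]
    return round(100 * counts["large"] / descended)


for label, counts in [
    ("dark-reared, on emergence", dark_initial),
    ("dark-reared, one week later", dark_week_later),
    ("light-reared, initial run", light_initial),
    ("light-reared, one week later", light_week_later),
]:
    print(label, percentages(counts), large_share_among_descenders(counts))
```

Restricting the count to animals that descended makes the shift in the dark-reared group easier to compare with the light-reared group, which had no failures to descend.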


It is interesting to hypothesize, from this
evidence, that selective response to differential
motion parallax (or better, motion
perspective) is built into the rat, the chick, 
and the lamb; but that the step in density, 
unaccompanied by differences caused by 
motion, had acquired cue value for the rat 
on some contingency basis. In the normal 
environment, presumably, when the animal 
looks downward, a change in density ac- 
companies motion perspective and can be- 
come associated with whatever avoidance 
or approach behavior is elicited by the 
motion cue. It has some “ecological va- 
lidity,” to use Brunswik’s term, and thus 
becomes to some extent effective itself. 

Whether the density stimulus alone would 
ever acquire cue status for the goats, sheep, 
and chickens we do not know. Motion cues 
are so important for the ungulates that 
the visual environment without them may 
be totally disregarded. But here we are on 
very unsafe ground. 

Our studies have seemed to show that 
discrimination of depth develops differently 
in different species, in the rat and the cat, 
for instance. Development continues after 
birth and requires a certain amount of light 
in the cat (and probably the monkey). To 
account for this development by a mere
reference to learning appears naive indeed. 
What kind of learning? What kind of 
opportunity? The latter question, at least, 
is open to experimental investigation. 


Fear and Loss of Support 


The second topic we wished to discuss 
was the relation of behavior on the cliff to 
fear and to loss of support. A brink is 
avoided and it is highly adaptive for an 
organism to avoid it. Does our research 
throw any light on the fear of a falling-off- 
place? 

While our research technique may be 
based on the assumption that a high place 
is dangerous, the simple choice of a “shal- 
low” in preference to a “deep” side does 
not enable us to conclude that fear is the 
basis for such a choice. The threshold 
experiment, where fewer animals descended 
as height was increased, is the only ex- 


periment that directly relates increasing 
height with an increasing tendency to avoid 
a drop-off.

However, we investigated the phenomenon
of “optical support” (or “visual support”)
more directly by placing animals on
the glass of the deep side, on a transparent 
surface 40” above the floor. Optical support 
might be defined as a relatively coarse
optical texture of the array surrounding 
the animal’s feet. Optical support, so 
defined, is usually accompanied by physical 
support; that is, by the ordinary stimula- 
tion of the vestibules, muscles, and skin, 
caused by gravity and the substratum, which 
controls equilibrium and posture. But our 
method permits the elimination of optical 
(visual) support without affecting physical 
support. This test served two purposes.
First, the reaction of the animal to the loss
of visual support can be observed and,
second, such a test permits a study of the
importance of vision to the species investigated.

The strongest reaction to the lack of
visual support was observed in the ungu- 
lates (lambs and kids). The noses of 
these animals bumped the surface as they 
were lowered onto the glass and they executed
a reflex-like backing response accompanied
by forelimb rigidity. This reflexive
postural attitude could only be removed by 
the addition of visual support to the surface 
directly under the animal. This reflex in 
these animals is as strong and stereotyped
as were Sherrington’s (1906) reflexes. 

The cats reacted to the loss of visual 
support by circling backwards and mewing 
until a visual surface was reached. Puppies 
usually backed up, too, after a period of 
immobility with trembling in some cases. 
The monkeys lay flat, apparently frightened, 
above the visual void. Few systematic 
observations were made of human infants 
but the ones carried out suggest that the 
human infant is uneasy without some vis- 
ual support. Chickens exhibited a peculiar 
high stepping gait when walking on the 
glass of the deep side. Rats, on the other 
hand, and aquatic turtles behaved over the 
void much as they did over a solid surface.

Observations on the lack of visual support
were not quantified, as were those on
descent from the board. This is because 
the observations were carried out only incidentally
to the main line of investigation.
But, tentatively, we may offer these hy- 
potheses: 

1. An animal's response to the lack of
visual support is unlearned, a reflex that 
is characteristic of its species. 

2. An animal's response to the lack of 
visual support is related to its way of life. 
Visual cues are least important to the rat, 
intermediate in the kitten, and relatively 
more important to the ungulate. 

However, more systematic research is 
needed on reactions to the loss of visual 
support. 


Comparison of Depth Discrimination in
Different Species 


The question of species comparison has 
been central to the whole plan of this 
investigation. Our studies have made it 
clear, if it was not so already, that 
such comparison is not very enlightening 
on a quantitative basis alone (such as 
thresholds). All the species tested on the
cliff showed some discrimination between 
the shallow and the deep side, even the 
aquatic turtle. This is our main conclusion, 
and in this respect all species were similar. 
But this is not to say that there were no 
differences among the species. Which kinds 
of potential stimuli are actually effective, 
and under what conditions, undoubtedly 
varies with the species. 

The term “effective” covers a number of 
problems. What determines whether a 
stimulus is effective or not? First of all, 
built-in structures; an animal cannot make
use of binocular disparity unless it has two 
eyes, overlapping fields of view, conver- 
gence movements to bring the latter to- 
gether, and so on. Sometimes maturation 
to a given stage must be considered, as 
witness the difference between the chick 
and the rat or cat at birth. Again, condi- 
tions of maturation making for "normal" 
stimulus effectiveness may vary, as evidenced
by the apparent necessity for some
light in the environment of the cat, as 
against the rat. In fact, the kind of light 
in the environment, and the range of sur- 
faces and objects visible may make a dif- 
ference in later discrimination (e.g., depth 
of field of view, in the case of hooded rats 
tested for preference of texture densities). 
Perhaps even specific reinforcement, or
feedback from motor performance can 
make a difference in some species, though 
we have found no evidence for it. 
Effectiveness of stimuli also varies, 
clearly, with the species’ normal environ- 
ment and way of life. A rat must be 
forced to make a choice on the basis of 
visual cues, but for the goat or chick, even 
at birth such a choice seems to be natural. 
It is necessary, therefore, to take a lesson 
from the ethologists and consider how the 
kind of discrimination to be compared fits 
with the ecology and biological require- 
ments of the species—its method of repro- 
duction, defense, food-getting and territorial 
adjustment. A real comparative psychology 
will only be written in such a context. 
One lesson for theories of perception can 
perhaps be drawn from this research because
it permitted a comparative survey.
The old treatises on depth perception attempted
to analyze [...] points in
an abstract geometrical space. But such
success as we have achieved seems to us 
to be founded on the treatment of space 
in terms of surfaces, depth at an edge, 
density, and differential motions produced 
by the animals’ own actions. Differences 
in stimulation described in such terms can 
also be meaningful in describing ecological 
differences correlated with species differences.
Air, water, and grass-covered earth
can be described in these ways, and are
certainly of critical importance in comparing 
differences in visual discrimination between 
birds, aquatic animals, and terrestrial ones. 

Finally, we can suggest that the nativism- 
empiricism controversy be abandoned as 
such, with the aim of restating the problems 
of development more specifically. They 
should be stated in terms of the species 
under consideration, its environment and 
means of adjusting to it, and especially in
terms of the information provided by the
environment for this adjustment. Whether
the animal has the necessary receptor mechanisms
for picking up the information
input, whether growth in a special kind of 
environment is required, and whether learn- 
ing on either a contingency or reinforcement 
basis is required for making the potential 
input effective are all questions which can be 
answered in the laboratory. 


SUMMARY AND CONCLUSION


The experiments which have been 
described made use of an optical testing 
situation which permitted comparative stud- 
ies, and which allowed the same essential 
stimulus variables to be applied to a number 
of different animals. 

All of the animals studied gave some 
evidence of discriminating depth at an 
edge. Even the aquatic turtles tended in
general to avoid the deep side, though the
preference was not as pronounced as in the
other species tested, all of which were
terrestrial. The discrimination of depth
may be less important for some species
than for others, and also less acute in some
species than in others.

In general, all of the animals studied 
discriminated visual depth by the time locomotion
was possible, but this time varied
widely, even among terrestrial species.

Analysis of the cues on which the preference
depended suggested that motion
perspective is more important than density
perspective for the animals in whom it was
experimentally isolated. In the hooded rat,
it appears to be innate as well, since dark-reared
animals discriminated between a
deep and a shallow surface when motion
perspective was probably the sole basis for
differentiation.

The results in general support a hypoth- 
esis of innate depth perception, though 
the presence of a certain kind of environment
during growth may be important for
late maturing animals. Furthermore, it has
been shown that innate mechanisms for
discriminating depth may be supplemented 
by the acquisition of a learned cue. 


BROWN, F. A., JR. The rhythmic nature of animals
and plants. Amer. Scientist, 1959, 47,
147-168.


BRUNSWIK, E. Perception and the representative
design of psychological experiments. Berkeley:
Univer. California Press, 1956.


CHOW, K. L., RIESEN, A. H., & NEWELL, F. W.
Degeneration of retinal ganglion cells in infant
chimpanzees reared in darkness. J. comp. Neurol.,
1957, 107, 27-42.


CRUIKSHANK, R. M. The development of visual
size constancy in early infancy. J. genet.
Psychol., 1941, 58, 327-351.


DENIS-PRINZHORN, MARIANNE. Perceptions des distances
et constance des grandeurs: Etude genetique.
Arch. Psychol., Geneve, 1960, 37 (Whole
No. 147), 181-309.


DUKE-ELDER, S. The eye in evolution. In S.
Duke-Elder (Ed.), System of ophthalmology.
Vol. 1. St. Louis: Mosby, 1958.


FORGAYS, D. G., & FORGAYS, J. W. The nature of
the effect of free-environmental experience in
the rat. J. comp. physiol. Psychol., 1952, 45,
322-328.

GIBSON, ELEANOR J. The role of shock in reinforcement.
J. comp. physiol. Psychol., 1952, 45,
18-30.


GIBSON, J. J. Visually controlled locomotion and
visual orientation in animals. Brit. J. Psychol.,
1958, 49, 182-194.


GIBSON, J. J., OLUM, P., & ROSENBLATT, F. Parallax
and perspective during aircraft landings.
Amer. J. Psychol., 1955, 68, 372-385.


GREENHUT, ANN M. Visual distance discrimination
in the rat. J. exp. Psychol., 1954, 47,
148-152.


GREENHUT, ANN M., & YOUNG, F. A. Visual depth
perception in the rat. J. genet. Psychol., 1953,
82, 155-182.




REFERENCES


HENDRICKS, S. B. Control of growth and repro- 
duction by light and darkness. Amer. Scientist, 
1956, 44, 229-247. 


HESS, E. H. Space perception in the chick. Scient.
Amer., 1956, 195, 71-80.


HOCHBERG, J. E. Nature and nurture in visual
perception. In L. Postman (Ed.), Psychological
research: Illustrative histories. New York:
Knopf, in press.


HYMOVITCH, B. The effects of experimental variations
on problem solving in the rat. J. comp.
physiol. Psychol., 1952, 45, 313-321.


JOHNSON, B., & BECK, F. L. The development of
space perception: I. Stereoscopic vision in preschool
children. J. genet. Psychol., 1941, 58,
247-254.


KURKE, M. I. The role of motor experience in the
visual discrimination of depth in the chick. J.
genet. Psychol., 1955, 86, 191-196.


LASHLEY, K. S., & RUSSELL, J. T. The mechanism
of vision: XI. A preliminary test of innate
organization. J. genet. Psychol., 1934, 45, 136-144.


MISHKIN, M., GUNKEL, R. D., & ROSVOLD, H. E.
Contact occluders: A method for restricting
vision in animals. Science, 1959, 129, 1220-1221.


NEALEY, S. M., & EDWARDS, BARBARA J. Depth ...
in rats without pattern vision ex...


... L. Visual movement ...
...on in cats after early ...
...ion. J. comp. physiol. ...


... M. R. Echolocation ...
... Psychol., 1957, 50, ...


ROBINSON, E. W., & WEVER, E. G. Visual distance
perception in the rat. U. Calif. Publ. Psychol.,
1930, 4, 233-239.


RUSSELL, J. T. Depth discrimination in the rat.
J. genet. Psychol., 1932, 40, 136-159.


SHERRINGTON, C. The integrative action of the 
nervous system. New Haven: Yale Univer. 
Press, 1906. 


SIEGEL, A. I. Deprivation of visual form definition
in the ring dove: I. Discrimination learning.
J. comp. physiol. Psychol., 1953, 46, 115-119.


SPALDING, D. A. Instinct and acquisition. Nature,
Lond., 1875, 12, 507-508.




THORNDIKE, E. The instinctive reactions of young
chicks. Psychol. Rev., 1899, 6, 282-291.
UPDEGRAFF, R. The visual perception of distance
in young children and adults: A comparative
study. Univer. Ia. Stud. child Welf., 1930, 4, No. 4.
VAN TUYL, M. C. Monocular perception of distance.
Amer. J. Psychol., 1937, 49, 515-542.
WALK, R. D., GIBSON, ELEANOR J., & TIGHE, T. J.
Behavior of light- and dark-reared rats on a
visual cliff. Science, 1957, 126, 80-81.
WALLS, G. L. The vertebrate eye. Bloomfield
Hills, Mich.: Cranbrook Institute of Science,
1942.
WARKENTIN, J., & SMITH, K. U. The development
of visual acuity in the cat. J. genet.
Psychol., 1937, 50, 371-399.


RICHARD D. WALK AND ELEANOR J. GIBSON


WATSON, J. B. Psychology from the standpoint
of a behaviorist. Philadelphia: Lippincott, 19


WAUGH, K. T. The role of vision in the
life of the mouse. J. comp. neurol. Psychol.,
1910, 20, 549-599.


WINDLE, C. D., WARD, J. S., NEDVED, K., &
NATHAN, J. The effect of mock tower height
in airborne training. HumRRO tech. Rep.,


YERKES, R. M. Space perception of
J. comp. neurol. Psychol., 1904, 4, 17-26.


(Received November 14, 1960)


